INTRODUCTION


A primary goal of this research is to develop adaptive signal processing
algorithms that will be useful in providing anti-jam (A/J) protection for
aircraft receiving systems. Because of the motion of the aircraft and the
uncertain position of the signal source, signal reception may be possible
from almost any direction of incidence. The uncertain and time-variable
nature of jammer positions requires a rapid adaptive capability for eliminating
one or more simultaneously operating jammers. Furthermore, aircraft receiving
arrays generally have only a few elements, each having a highly irregular and
sometimes unpredictable radiation pattern. The problem is not simple.

This report is divided into three parts, each one representing a
major effort contributing to adaptive A/J technology.

Part A describes single channel algorithms for separating signals
based on their power levels. Jamming signals, when of concern, are
generally large in amplitude. By siphoning off the strongest input
components, the desired signal can be decoded from the remainder. Adaptive
signal processors are proposed, analyzed, and computer simulated that have
the capability of separating signals by power with controllable SNR
slicing thresholds. The "ALEWIN" algorithm, the first conceived, requires
injection of synthetic noise of controllable amplitude (to control the slicing
threshold). An improved "ALEWAIN" is also described. This algorithm is
quieter and simpler to implement. The effects of the synthetic noise
are obtained algorithmically. The analysis proves stability conditions,
determines the rate of convergence, and determines the noise in the adaptive
filter weight vector and its effects on system performance. This approach is
usable for signal separation when the signals are narrowband with
non-overlapping passbands. When these passbands overlap, separation would only
be possible with a multichannel system connected to an array of antenna
elements rather than to a single element. Development of multichannel adaptive
power separators has been proposed for future work.

Part B of this report describes, simulates, and analyzes an adaptive
antenna scheme that sustains (via a "soft constraint") an approximately
uniform sensitivity in all directions except those corresponding to
arrival directions of strong signals (which presumably are jammers).
The threshold level dividing strong and weak signals is controllable.
This scheme has never been tried before and appears to be quite workable
and simple. Computer simulations show that several strong jammers (which
may be either narrowband or broadband) can be eliminated simultaneously
when the antenna array contains only a few elements. Irregularities in
the individual element patterns cause nonuniformity in the overall system
receiving pattern, but do not significantly reduce the system's ability
to notch out strong jammers. Many analytical problems remain to be
solved, such as how many jammers can be eliminated simultaneously, how
deep the nulls will be versus SNR, bandwidth, and direction of arrival, what
determines the rate of convergence, etc. The present algorithm requires the
injection of synthetic noise. A new scheme without injected noise is
under development.

Part C is a reprint of a paper published in the September 1976
issue of IEEE Transactions on Antennas and Propagation. The paper
describes, among other things, work on the "linear random search" algorithm.
This adaptive algorithm is by no means as efficient as the LMS algorithm
(in terms of noise in the solution weight vector versus the speed of
convergence), but is generally simpler to implement and can be applied to
systems whose patterns are adjusted by phase shift control rather than by
variable attenuators. LMS can only be used in the latter system, not
the former. The linear random search algorithm is shown to have operational
properties similar to those of a steepest descent adaptive algorithm which
estimates gradient components one at a time. The random search algorithm
is expected to have wide applicability and to be implementable at HF and
IF frequencies. It could be applied to almost any form of adjustable
system parameter, from microwave cavity paddles, to adjustable tuning stubs,
to phase shifters, to attenuators, etc. Many theoretical problems remain
to be solved, such as behavior in systems with multimodal performance
surfaces, and derivation of relationships between system performance and
speed of convergence. This is a new algorithm. It appears to be analyzable
in many circumstances. Because of its linear nature and relative simplicity
compared to other random search algorithms, it may become very widely used.


Part  A 

ADAPTIVE SEPARATION OF SIGNALS IN NOISE IN TERMS OF THEIR RELATIVE POWER LEVELS
I.  Introduction 

In this section we describe a study of an adaptive device which can
strip off the coherent signal component with the highest power. It has
two outputs: one is a filtered version of the selected component, and the
other is the total input signal with the selected component cancelled
out. This device can be used alone to provide one degree of A/J
protection (we assume the jammers to be powerful), or, with several in tandem,
can be used to strip off and rank the various coherent signal components
by power. It may also be generalized to an array configuration. This
section describes the device, explores its theoretical behavior briefly,
and presents some results of computer simulations of the device's performance.

II. Background

The output of a receiving antenna array can often be modeled as
wideband noise plus several narrowband signal components of differing
frequencies and power levels. The problem addressed in this section is
that of automatically ranking the coherent components in order of their
respective powers while disregarding wideband components such as noise.

Such a scheme has many uses. Usually only one or a few of the signal
components are useful. The others are not useful and under some circumstances
may hinder detection and estimation of the desired component. A signal
ranking scheme would allow the processor to deal only with the desired
component(s). In other applications the ranking might provide information
in itself. Generalized to an array configuration it might be used to
separate signal components and identify the azimuth of each source. The
basic approach uses the "Adaptive Line Enhancer" (ALE) [1] in a structure
that allows it to strip off the most powerful coherent component of the
input signal and pass all the rest. Similar additional stages could strip
off the successively less powerful components [2]. This concept is
diagrammed in Figure A-1. Modifications to the ALE's adaptive algorithm
will be shown to improve the separation properties.

An inherent advantage of the ALE configuration is that it could
provide two outputs. One is the input signal with the most powerful
component subtracted out, thus providing the input for the next stage.
The other output is a filtered version of the stripped component, allowing
that component to be processed independently to find its parameters (e.g.,
frequency, azimuth).

The Adaptive Line Enhancer was introduced and described in reference
[1]. Reference [2] describes its behavior with inputs consisting of white
noise and a sinusoid. A diagram of the ALE is shown in Figure A-2.

Figure 1. A Scheme for Ranking and Sorting Signals by Their Relative Powers

Figure 2. The Adaptive Line Enhancer (ALE)

The error signal e(k) is the difference between the input x(k) and
that signal delayed by Δ time units and filtered by an adaptive transversal
filter. The error signal is used by the Widrow-Hoff Least Mean Square (LMS)
algorithm to adjust the weights of the adaptive filter to minimize the error
power [1]. This system's behavior is best exemplified with an input of
a sinusoid plus white noise. Since the sinusoid is coherent in time it
is completely predictable, and a filter can be found (via the adaptive
algorithm) which filters the delayed signal to provide an output y(k) of
the same phase. Thus a sinusoid may be successfully subtracted from the
input signal and the error power minimized thereby. However, since the
noise is incoherent in time there is no way that a filtered version of the
delayed noise can cancel any of the input noise. Thus to minimize mean
square error the adaptive algorithm must find a filter impulse response
which allows the sinusoid through but inhibits the noise as much as possible.
In fact the adaptive filter found in this case is a matched filter with
sinusoidal impulse response, which passes the sinusoidal component but
has the smallest possible bandwidth to minimize the noise power in the
filter output. The sinusoid and the noise may be viewed as adversaries in
the adaptive process. Were the input just the sinusoid, the filter would
converge to a form which had a gain of 1 at the sinusoid's frequency,
thereby cancelling the sinusoid altogether in e(k). If the input were
white noise only, the filtered signal would actually increase the error
power, so the LMS algorithm turns off the filter by adjusting all the weights
to zero. If the input contains both signal and noise then the adaptive
algorithm must make a tradeoff to minimize the total error power.
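The behavior described above can be illustrated with a minimal sketch of an LMS-driven line enhancer. This is not the simulation code used in the report: the function names, the sinusoid frequency (0.1 cycles per sample), the noise power (0.1), the filter length (16), the delay (1 sample) and the adaptation constant (0.005) are all assumed here purely for illustration.

```python
import math
import random

def ale_lms(x, n_weights, delay, mu):
    """LMS adaptive line enhancer (illustrative sketch, not the report's code).

    Returns the final weights, the filter output y(k) and the error e(k).
    """
    w = [0.0] * n_weights
    y_out, e_out = [], []
    for k in range(len(x)):
        # Tapped delay line fed by the delayed input: x(k-delay), x(k-delay-1), ...
        taps = [x[k - delay - i] if k - delay - i >= 0 else 0.0
                for i in range(n_weights)]
        y = sum(wi * ti for wi, ti in zip(w, taps))
        e = x[k] - y                                   # error signal e(k)
        w = [wi + mu * e * ti for wi, ti in zip(w, taps)]  # LMS weight update
        y_out.append(y)
        e_out.append(e)
    return w, y_out, e_out

def power(v):
    """Average power of a list of samples."""
    return sum(s * s for s in v) / len(v)

# Assumed test input: unit-power sinusoid plus white noise of power 0.1.
rng = random.Random(0)
N = 6000
sinusoid = [math.sqrt(2.0) * math.sin(2.0 * math.pi * 0.1 * k) for k in range(N)]
noise = [rng.gauss(0.0, math.sqrt(0.1)) for _ in range(N)]
x = [s + v for s, v in zip(sinusoid, noise)]

w, y, e = ale_lms(x, n_weights=16, delay=1, mu=0.005)

tail = slice(N - 2000, N)          # examine the converged portion only
print("input power :", power(x[tail]))
print("output power:", power(y[tail]))
print("error power :", power(e[tail]))
```

After convergence, most of the sinusoid's power should appear in y(k), while e(k) should contain mainly the noise.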

Quarterly reports 1 and 2 [2] discuss the behavior of the ALE at some
length. Two significant points were made.

1) For an input consisting of a single sinusoid of frequency $f_0$ and
white noise, the convergent filter gain at frequency $f_0$ is given by

$$a^* = \frac{\frac{n}{2}\,\mathrm{SNR}}{1 + \frac{n}{2}\,\mathrm{SNR}},$$

where SNR is defined as the ratio of the input sinusoid's power to that of
the input white noise, and n is the number of weights in the transversal
filter. A graph depicting $a^*$ as a function of SNR is shown in Figure A-3.

It may be seen that $a^* \to 0$ as $\mathrm{SNR} \to 0$. Clearly $a^*$ is a
nonlinear function of SNR.
Figure 3. Filter Gain of the ALE at Convergence versus Input Power


2) If the input signal consists of coherent components which are
sufficiently separated in frequency and if the number of weights n is large
enough, then the convergent filter's impulse response will be the
superposition of contributions from each of the coherent inputs. Each of these
contributions is the same as if that coherent component were the only
input. This property is called "pseudolinearity". If the input is assumed
to be composed of several sinusoids plus white noise it can be shown that
$a_i^*$, the optimal filter gain at the frequency of the i-th sinusoid, is
given by

$$a_i^* = \frac{\frac{n}{2}\,\mathrm{SNR}_i}{1 + \frac{n}{2}\,\mathrm{SNR}_i},$$

where $\mathrm{SNR}_i$ is the ratio of the power of the i-th input sinusoid to
that of the total input noise. The fact that each optimal gain $a_i^*$ is not
dependent on the power of any coherent input other than the i-th is a result
of the ALE's pseudolinearity.

With these two principles it was shown in reports 1 and 2 [2] that the
ALE can be used to strip the most powerful coherent component out of an
input signal. This may be seen in the particular case where the input
consists of several sinusoids plus white noise. For those sinusoidal
components for which $\frac{n}{2}\,\mathrm{SNR}_i \gg 1$ the filter gain is
approximately one and they are almost completely cancelled out of the error
signal e(k). They are of course fully represented in the filter output y(k).
However the components for which $\frac{n}{2}\,\mathrm{SNR}_i \ll 1$ have
associated filter gains near zero. As a result they appear in the error signal
and not in the filter output. The same is true of the broadband noise
components. Thus the ALE can perform a separation of the input components on
the basis of their input powers and bandwidths. The threshold of separation
for sinusoids or narrowband signals is given by
$\frac{n}{2}\,\mathrm{SNR}_i = 1$, or $P_i = \frac{2\sigma^2}{n}$, where
$\sigma^2$ is the power of the white input noise and $P_i$ is the power of the
i-th sinusoid ($\mathrm{SNR}_i = P_i/\sigma^2$). The threshold, denoted
$\theta$, is a function both of the input noise power $\sigma^2$ and the
number of filter weights n. Figure A-4 shows an ALE adjusted to slice off the
most powerful sinusoidal component.
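The gain formula and the slicing threshold can be tabulated with a few lines of code. The following is only a sketch: the function names and the numerical values (n = 32, unit noise power) are assumed for illustration and do not appear in the report.

```python
def ale_gain(snr, n_weights):
    """Convergent ALE gain a* = (n/2 * SNR) / (1 + n/2 * SNR)."""
    g = 0.5 * n_weights * snr
    return g / (1.0 + g)

def slicing_threshold(noise_power, n_weights):
    """Power threshold theta = 2 * sigma^2 / n, where n/2 * SNR = 1."""
    return 2.0 * noise_power / n_weights

n = 32          # assumed number of filter weights
sigma2 = 1.0    # assumed white input noise power
theta = slicing_threshold(sigma2, n)

# Gains for sinusoids well above, exactly at, and well below the threshold.
for p in (10.0 * theta, theta, 0.1 * theta):
    print("P = %.5f  gain = %.4f" % (p, ale_gain(p / sigma2, n)))
```

A sinusoid whose power equals the threshold has a gain of exactly one half, which is why the threshold marks the dividing line between components that are largely passed to y(k) and those largely left in e(k).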


Almost any practical application of such a level separator would
require that the separation threshold $\theta$ be adjustable over a wide range.
However, as reference [2] shows, there are compelling reasons for changing
neither the input noise power nor the tapped delay line length n. For
obvious reasons the input noise power cannot be decreased. It can be
increased artificially, however, by adding extra white noise to the ALE
input, but this has the disadvantage that the extra noise propagates on
through the error signal into successive stages. Changing the filter length
n changes both the filter dynamics and the number of components which can
be handled simultaneously and independently. Usually the user would desire
to have these parameters remain constant. To allow alteration of the
threshold level without changing either $\sigma^2$ or n, an alternate separator
was suggested [2]. This processor, the ALE with injected noise (ALEWIN),
diagrammed in Figure A-5, provides for the variable threshold of

$$\theta = \frac{2(\sigma^2 + \sigma_n^2)}{n},$$

leaving $\sigma^2$ and n fixed. This is accomplished by adding
extra white noise of power $\sigma_n^2$ into the adaptive filter input. The
added noise decreases the apparent SNRs of the coherent input components and
therefore modifies the power slicing level. Figure A-6 shows the optimal
filter gain as a function of $\mathrm{SNR}_i$ for the ALEWIN. The term
$\mathrm{SNR}_i$ is here defined by $P_i/(\sigma^2 + \sigma_n^2)$.
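As a worked illustration of the ALEWIN threshold relation, the injected noise power needed for a desired slicing threshold follows by solving for $\sigma_n^2$. The numerical values below (n = 32, $\sigma^2$ = 1, desired $\theta$ = 0.5) are assumed purely for illustration and are not taken from the report.

```latex
\theta = \frac{2(\sigma^2 + \sigma_n^2)}{n}
\quad\Longrightarrow\quad
\sigma_n^2 = \frac{n\,\theta}{2} - \sigma^2 .

% Illustrative numbers (assumed): n = 32, \sigma^2 = 1, desired \theta = 0.5
\sigma_n^2 = \frac{32 \times 0.5}{2} - 1 = 7,
\qquad
\text{check: } \theta = \frac{2(1 + 7)}{32} = 0.5 .
```

Because $\sigma_n^2$ cannot be negative, injected noise can only raise the threshold above its no-injection value of $2\sigma^2/n$ (0.0625 for these assumed numbers).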
This new scheme has the disadvantage, however, that the added noise
propagates through the filter, into the error signal, and into succeeding
stages. It is not as bad as would be the case if the noise were injected
into the ALE input, but still the effect is undesirable. This problem
can be mitigated to some extent by adding a "slave" filter which filters a
delayed version of the input but without its added noise. This slave
filter uses weights copied from the adaptive filter. Since no noise is
actually added into its input, the slave filter output and the error signal
formed with it are devoid of the direct effects of the injected noise.

This method is practical and may have application in some situations.
Zahm [6], for example, has suggested its use for the suppression of strong
jamming in an adaptive beamformer without obliteration of desired weak
input signals. However this method still has a major flaw in that the
injected noise increases the adaptive filter's "misadjustment." The
adaptive algorithm which determines the weights of the adaptive filter
produces errors or noise in its estimates of the optimal weights. Weight
noises are a function, among other things, of the input noise. In a
normal well-designed adaptive processor, weight noises are tolerably small
and weight errors are not a problem. However in this case the large amount
of injected noise can cause large and bothersome noises in the weights, and
cause significant amounts of random modulation of the filter output.

Further research has shown that this problem can also be solved.
By appropriately modifying the adaptive algorithm used to adjust the
filter weights, behavior similar to that caused by the injected white
noise can be attained. In addition the modified algorithm does not cause
the increased misadjustment that the injected noise does, nor does it
require the slave filter. It has been dubbed the "ALE with algorithmically
injected noise" (ALEWAIN), and its operation is clearly superior to that
of the ALEWIN [2]. The next section will describe the mathematical
formulation of the ALEWAIN and show its relationship to the ALE and
ALEWIN.

III. Theoretical Motivation for the ALEWAIN

This section will provide motivation for the ALEWAIN configuration
by first analyzing the mechanism of filter optimization for the ALEWIN
and then showing that in expectation the same effect can be achieved by
modifying the algorithm. To proceed, some definitions and background are
required.

Figure A-7 is a block diagram of the ALEWIN. The input signal, x(k),
passes into two paths, one directly to a differencing circuit, and the other
through a decorrelating delay of Δ time units, through a tapped delay line
filter, and into the other input of the differencing circuit. This
difference e(k), termed the error signal, is used by the adaptive algorithm
to adjust the filter in such a way as to minimize the expected power of
e(k). The following definitions allow this to be put into mathematical
form:

$x(k)$ = the input signal

$n(k)$ = the noise injected into the filter input

$f(k)$ = the actual input into the filter $[= x(k-\Delta) + n(k)]$

$F(k) = [f(k)\; f(k-1)\; \ldots\; f(k-n+1)]^T$ = samples of the filter input
in the tapped delay line

$W(k) = [w_1(k)\; \ldots\; w_n(k)]^T$ = the impulse response of the tapped
delay line filter (also called the "weight vector")

$y(k)$ = filter output $= W^T(k)\,F(k)$

$e(k)$ = the error signal $= x(k) - y(k) = x(k) - W^T(k)\,F(k)$

Figure 7. The Structure of the Adaptive Line Enhancer with Injected Noise (ALEWIN)

It will be assumed that x(k) and n(k) are random variables which are
statistically independent, stationary and zero-mean. The autocorrelation
function of x(k) is $r_x(\tau)$ and that of n(k) is $r_n(\tau)$. The injected
noise, n(k), is assumed white. Therefore $r_n(\tau) = \sigma_n^2\,\delta(\tau)$,
where $\sigma_n^2$ is the power of the injected noise process.

The filter impulse response (the weight vector W(k)) is iteratively
adjusted toward its optimum value by the Widrow-Hoff LMS algorithm, which,
in expectation, will reduce $e^2(k)$ to its minimum value. The LMS
algorithm updates its estimate of the optimal weight vector at each sample
interval by making an instantaneous estimate of the gradient of the error
surface and then moving toward the minimum. Mathematically this may be
written:

$$W(k+1) = W(k) + \mu\,e(k)\,F(k), \qquad W(0) = W_0, \tag{III-1}$$

where $e(k)\,F(k)$ is the magnitude of the instantaneous gradient estimate and
$\mu$, the adaptation constant, determines how much the weight vector will be
changed in response to that estimate.

By making some substitutions this recursion equation can be written
in another useful form. Note that by definition the error is given by
$e(k) = x(k) - F^T(k)\,W(k)$. Substituting this into equation III-1 and
collecting terms in W(k):

$$W(k+1) = [I - \mu\,F(k)\,F^T(k)]\,W(k) + \mu\,x(k)\,F(k), \qquad W(0) = W_0. \tag{III-2}$$

From this equation W(k) can be computed iteratively given only the input
signal and injected noise.
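The equivalence of the two forms of the update can be checked numerically. The sketch below applies one step of each form to the same arbitrary data and compares the results; the function names and the random test values are assumptions made only for this illustration.

```python
import random

def lms_update_error_form(w, f, x_k, mu):
    """Eqn. III-1: W(k+1) = W(k) + mu * e(k) * F(k)."""
    e = x_k - sum(wi * fi for wi, fi in zip(w, f))
    return [wi + mu * e * fi for wi, fi in zip(w, f)]

def lms_update_transition_form(w, f, x_k, mu):
    """Eqn. III-2: W(k+1) = [I - mu * F * F^T] * W(k) + mu * x(k) * F(k)."""
    n = len(w)
    new_w = []
    for i in range(n):
        # Row i of the transition matrix [I - mu * F * F^T] applied to W(k).
        row = sum(((1.0 if i == j else 0.0) - mu * f[i] * f[j]) * w[j]
                  for j in range(n))
        new_w.append(row + mu * x_k * f[i])
    return new_w

# Arbitrary (assumed) test data: 4 weights, random taps and input sample.
rng = random.Random(1)
w = [rng.uniform(-1.0, 1.0) for _ in range(4)]
f = [rng.gauss(0.0, 1.0) for _ in range(4)]
x_k = rng.gauss(0.0, 1.0)

a = lms_update_error_form(w, f, x_k, mu=0.01)
b = lms_update_transition_form(w, f, x_k, mu=0.01)
max_diff = max(abs(p - q) for p, q in zip(a, b))
print("largest difference between the two forms:", max_diff)
```

The two results agree to within floating-point rounding. The second form is less efficient to compute but separates the transition term from the driving term, which is what the expectation analysis relies on.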




Suppose the expected value of the weight vector were examined. This
would represent the average behavior of the ALEWIN in the statistical
sense. If the expected values of both sides of Eqn. III-2 are taken and
if it is assumed that $F(k)\,F^T(k)$ and W(k) are uncorrelated, this can be
done:

$$E[W(k+1)] = [I - \mu R_F]\,E[W(k)] + \mu\,E[x(k)\,F(k)], \qquad E[W(0)] = W(0) = W_0, \tag{III-3}$$

where $R_F = E[F(k)\,F^T(k)]$, the autocorrelation matrix of the process
F(k). The tapped delay line data vector F(k) can be written as the sum
of X(k), the vector representing the component due to the input signal, and
N(k), the component due to the injected noise. Further, x(k) and n(k)
have been assumed independent. Therefore $E[F(k)\,F^T(k)]$ becomes
$E[X(k)\,X^T(k)] + E[N(k)\,N^T(k)] = R_x + R_n$, the sum of the two
autocorrelation matrices for the two separate processes. By the definition of
the autocorrelation matrix, the ij-th element is $[R]_{ij} = r(i-j)$. Since
$r_n(\tau) = \sigma_n^2\,\delta(\tau)$, this implies that
$R_n = \sigma_n^2 I$, where I is the identity matrix. The input signal matrix
$R_x$ is not in general diagonal.

The same facts as above may be applied to the evaluation of the
driving term $E[x(k)\,F(k)]$. Since $F(k) = X(k) + N(k)$ and since x(k) and
n(k) are independent, the term becomes $E[x(k)\,X(k)] = P_x$, the
autocorrelation vector of the process x(k). The i-th element of $P_x$ is given
by $[P_x]_i = r_x(\Delta + i - 1)$.
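To make these definitions concrete, the sketch below builds $R_x$ and $P_x$ for one specific input model: a random-phase sinusoid of power P plus white noise of power $\sigma^2$, for which $r_x(\tau) = P\cos(\omega\tau) + \sigma^2\delta(\tau)$. That input model, the function names and the numerical values are assumptions chosen for illustration only.

```python
import math

def autocorr_sinusoid_plus_white(power, omega, noise_power, lag):
    """r_x(lag) for a random-phase sinusoid plus white noise (assumed model)."""
    r = power * math.cos(omega * lag)
    if lag == 0:
        r += noise_power          # white noise contributes only at zero lag
    return r

def build_rx_px(power, omega, noise_power, n_weights, delay):
    """Return (R_x, P_x) with [R_x]_ij = r_x(i-j) and [P_x]_i = r_x(delay+i-1)."""
    def r(lag):
        return autocorr_sinusoid_plus_white(power, omega, noise_power, lag)
    R_x = [[r(i - j) for j in range(n_weights)] for i in range(n_weights)]
    # Python index i (0-based) corresponds to element i+1 in the text,
    # so the lag delay + (i+1) - 1 reduces to delay + i.
    P_x = [r(delay + i) for i in range(n_weights)]
    return R_x, P_x

R_x, P_x = build_rx_px(power=1.0, omega=2.0 * math.pi * 0.1,
                       noise_power=0.1, n_weights=4, delay=1)
for row in R_x:
    print(["%.3f" % v for v in row])
print(["%.3f" % v for v in P_x])
```

Note that the white noise appears only on the diagonal of $R_x$ and, because the delay is at least one sample, does not appear in $P_x$ at all.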

Assuming that the transition term $F(k)\,F^T(k)$ and the weight vector W(k)
are uncorrelated is a common simplifying assumption in work on adaptive
signal processors [1,5,7]. It is an excellent assumption when the constant
$\mu$ is small enough that adaptation is slow.


With these observations equation III-3 may be simplified to the
following:

$$E[W(k+1)] = [I - \mu(R_x + \sigma_n^2 I)]\,E[W(k)] + \mu P_x, \qquad E[W(0)] = W_0. \tag{III-4}$$

This equation then describes the expected behavior of the weight vector
of the ALEWIN. In fact if this equation is solved to find the convergent
behavior of the algorithm and if assumptions of the sort used in
a previous report [2] are applied, this equation yields the formulas
obtained in this report by the Parseval's Theorem approach. Furthermore if
x(k) is assumed to contain only white noise and sinusoidal components and
if $\sigma_n^2$ is set to zero, then Eqn. III-4 converges to the functional
forms described in reference 3. Thus equation III-4 describes the operation
and behavior of both the ALE and the ALEWIN.
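The convergent solution of this recursion can be written out explicitly. The following short derivation is a sketch that assumes $R_x + \sigma_n^2 I$ is positive definite; the symbols $W_\infty$, $V(k)$ and $\lambda_{\max}$ (the largest eigenvalue of $R_x$) are introduced here for illustration and are not the report's notation.

```latex
% Fixed point: set E[W(k+1)] = E[W(k)] = W_\infty in the mean recursion.
W_\infty = [I - \mu(R_x + \sigma_n^2 I)]\,W_\infty + \mu P_x
\quad\Longrightarrow\quad
(R_x + \sigma_n^2 I)\,W_\infty = P_x ,
\qquad
W_\infty = (R_x + \sigma_n^2 I)^{-1} P_x .

% Deviation from the fixed point: V(k) = E[W(k)] - W_\infty obeys
V(k+1) = [I - \mu(R_x + \sigma_n^2 I)]\,V(k) ,

% which decays to zero for every eigenvalue \lambda_i of R_x when
|1 - \mu(\lambda_i + \sigma_n^2)| < 1
\quad\Longleftrightarrow\quad
0 < \mu < \frac{2}{\lambda_{\max} + \sigma_n^2} .
```

Setting $\sigma_n^2 = 0$ recovers the ordinary Wiener solution $R_x^{-1} P_x$ of the ALE, so the injected noise acts only through the added diagonal term.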

With this background, the modified algorithm may be introduced.
Suppose that the system used is exactly as in Fig. 1 except that there is
no injected noise. If so then $F(k) = X(k)$ and the recursion expression
for the expected weight vector would be:

$$E[W(k+1)] = [I - \mu R_x]\,E[W(k)] + \mu P_x, \qquad E[W(0)] = W_0. \tag{III-5}$$

As pointed out in the previous paragraph this is simply the weight vector
recursion for the ALE, the limiting case of the ALEWIN as $\sigma_n^2$
approaches zero. However suppose that instead of the standard LMS algorithm,
another adaptive algorithm (the "leaky" LMS algorithm) is used. Suppose the
weight vector update equation is given by:

$$W(k+1) = \gamma\,W(k) + \mu\,e(k)\,X(k), \qquad W(0) = W_0, \tag{III-6}$$

where $\gamma > 0$. For this work $\gamma$ will also be assumed to be less
than or equal to 1. The action of this algorithm at each sample instant is to
add in the new instantaneous estimate of the error surface gradient but also to
diminish the weight vector by a small factor (causing it to "leak"). The
rationale for this will be discussed later.

Suppose that the weight vector update of Eqn. III-6 is applied to
the ALE configuration. Making the appropriate changes, Eqn. III-5 then
becomes:

$$E[W(k+1)] = [\gamma I - \mu R_x]\,E[W(k)] + \mu P_x, \qquad E[W(0)] = W_0, \tag{III-7a}$$

or,

$$E[W(k+1)] = [I - (1-\gamma) I - \mu R_x]\,E[W(k)] + \mu P_x, \qquad E[W(0)] = W_0, \tag{III-7b}$$

or,

$$E[W(k+1)] = \left[I - \mu\left(\frac{1-\gamma}{\mu}\,I + R_x\right)\right] E[W(k)] + \mu P_x, \qquad E[W(0)] = W_0. \tag{III-7c}$$

But Eqn. III-7c is exactly the same form as that of the normal LMS
implementation of the ALE (compare with Eqn. III-5) if
$\left[\frac{1-\gamma}{\mu}\,I + R_x\right]$ were
interpreted as the autocorrelation matrix of the tapped delay line data
vector. However, by comparison with Eqn. III-4, it may be seen that this
is exactly the recursion expression for the expected value of the ALEWIN
weight vector if $\frac{1-\gamma}{\mu}\,I = R_n$. But $R_n = \sigma_n^2 I$;
therefore, if $\gamma$ is chosen so that $\gamma = 1 - \mu\sigma_n^2$, then
the ALE configuration with the modified (leaky)
adaptive algorithm and no injected noise would have the same expected
weight vector as the ALE with injected noise of power $\sigma_n^2$ using the
normal LMS adaptive algorithm. Thus modifying the adaptive algorithm has the
same effect as injecting noise into the filter input. The line enhancer driven
by the leaky LMS algorithm is called the ALEWAIN.
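A minimal sketch of a leaky-LMS line enhancer is given below. It is not the report's simulation code: the function names and all numerical values (16 weights, unit delay, adaptation constant 0.005, equivalent injected noise power 0.8, a noise-free sinusoid at 0.1 cycles per sample) are assumed for illustration. With these values the equivalent threshold $2\sigma_n^2/n$ is 0.1, so a sinusoid of power 1 lies well above it and one of power 0.01 lies well below it. The two sinusoids are run separately to keep the example short.

```python
import math

def alewain(x, n_weights, delay, mu, sigma_n2):
    """Leaky-LMS line enhancer: W(k+1) = gamma*W(k) + mu*e(k)*X(k).

    gamma = 1 - mu*sigma_n2, so sigma_n2 plays the role of the equivalent
    injected noise power. Illustrative sketch only.
    """
    gamma = 1.0 - mu * sigma_n2
    w = [0.0] * n_weights
    y_out, e_out = [], []
    for k in range(len(x)):
        # Delayed tap vector X(k); no noise is actually injected.
        taps = [x[k - delay - i] if k - delay - i >= 0 else 0.0
                for i in range(n_weights)]
        y = sum(wi * ti for wi, ti in zip(w, taps))
        e = x[k] - y
        w = [gamma * wi + mu * e * ti for wi, ti in zip(w, taps)]
        y_out.append(y)
        e_out.append(e)
    return w, y_out, e_out

def residual_fraction(power_in, n_samples=4000):
    """Fraction of a sinusoid's power left in e(k) after convergence."""
    amp = math.sqrt(2.0 * power_in)
    x = [amp * math.sin(2.0 * math.pi * 0.1 * k) for k in range(n_samples)]
    _, _, e = alewain(x, n_weights=16, delay=1, mu=0.005, sigma_n2=0.8)
    tail = n_samples - 1000                 # converged portion only
    p_x = sum(s * s for s in x[tail:]) / 1000.0
    p_e = sum(s * s for s in e[tail:]) / 1000.0
    return p_e / p_x

strong = residual_fraction(1.0)    # ten times the assumed threshold
weak = residual_fraction(0.01)     # one tenth of the assumed threshold
print("strong sinusoid, fraction of power left in e(k):", strong)
print("weak sinusoid,   fraction of power left in e(k):", weak)
```

The strong sinusoid should be almost entirely removed from the error signal, while the weak one should pass into it nearly unattenuated, even though no noise is present anywhere in the system.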


IV.  Discussion 


The previous section shows that the ALEWAIN gives the same mean
weight vector as that of the ALEWIN (but not the same variance in the weight
vector). Whatever value of $\sigma_n^2$ would have been chosen to separate
the most powerful coherent component with the ALEWIN can be related to the
proper value of $\gamma\,(= 1 - \mu\sigma_n^2)$ which will allow the ALEWAIN to
achieve the same effect in the mean. However, since no noise is actually
injected in the ALEWAIN, it significantly outperforms the ALEWIN and is cheaper
to implement. The ALEWIN requires the actual injection of noise into the
filter input. Therefore the filter output y(k) and the difference (error
signal) e(k) are noisier than they would be if only the input signal were
driving the adaptive algorithm. Since both the filter output and the
difference signal are desired outputs of the power separator, this extra
noise is deleterious. This problem was partially eliminated in previous
work [1] by using a duplicate filter, the so-called "clean" filter, whose
impulse response is copied from the main filter. Since no noise is
injected into its input, the output and ensuing difference signal are not
corrupted by the injected noise. But its weights are noisier than need be
because the main filter weights are made more noisy by the injected noise.
In addition, the use of the clean filter increases by 50% the number of
multiplications required for each iteration of the filter. With the ALEWAIN
the duplicate filter is not needed since no noise is actually injected.
Furthermore, the weights of the ALEWAIN are less noisy for the same speed
of convergence. Look again at the adaptive update scheme for the ALEWIN.
The estimate of the gradient of the error surface is $-e(k)\,F(k)$. Suppose
the ALEWIN has converged. If so then the expected value of the gradient
is zero. If the weight vector were driven by the true gradient, then the
weight vector would be unchanging at convergence. However it is actually
driven by an instantaneous estimate of the gradient. The more noise
that is injected, the more $-e(k)\,F(k)$ will differ instantaneously
from its expected value of zero at convergence. Thus the weight vector
will not stay at its optimal value but will be perturbed away. This is the
manifestation of weight noise mentioned earlier and, in the case of the
ALEWIN, it increases with the value of $\sigma_n^2$. Since the ALEWAIN
configuration injects no noise, its weight noise is a function only of the
input signal as well as $\gamma$ and the adaptation constant $\mu$.

Some insight into the equivalence of the ALEWIN and ALEWAIN can be
gained by discussing the effect of the injected noise on the algorithm.
It was shown [1] that the effect of adding noise to the filter input was to
decrease the effective SNR of each input component. Since the optimal gain

$$a_i^* = \frac{\mathrm{SNR}_i \cdot n/2}{1 + \mathrm{SNR}_i \cdot n/2}$$

is decreasing with decreasing $\mathrm{SNR}_i$,
increasing the injected noise has the effect in expectation of reducing
the contribution in the weight vector from the less powerful components.

In terms of the weight vector adjustment algorithms, Eqn. III-4, the
expected weight vector for the ALEWIN, may be compared with Eqn. III-5,
the expected weight vector for the ALE. It may be seen that in expectation
they both have the same driving term $\mu P_x$. Thus the injected noise
contributes nothing to the driving term. The only place the injected noise
appears is in the transition term. It serves only to decrease the magnitude of
the weight vector at each iteration. The higher the injected noise, the
greater is the decrease in the weight vector. Since n(k) is white, in
expectation all weights are decreased equally. Thus $P_x$ tends to increase
the magnitude of the weight vector and $R_n$ tends to decrease it. The terms
vie, on the basis of power, to determine the convergent weight vector
magnitude. The vector $P_x$ must be large to compensate for the decrease in
the weight vector caused by $R_n$.

The ALEWAIN performs exactly the same function by altering the
adaptive algorithm so that it deterministically decreases the weight
vector at each iteration rather than relying on the statistical effects of
the white injected noise. To remain fully represented in the weight
vector, each input component must be strong enough to counteract the
effects caused by $\gamma$.

This modified algorithm has been termed leaky LMS (LLMS) since if
e(k)X(k) = 0 then the weight vector tends to leak away to zero as k tends
to infinity. It is also a good model of analog implementations of the LMS
algorithm where imperfect (leaky) integrators are used. A distinction
should be drawn here. In the case of analog integrators, the "leakiness"
is an undesirable feature and much design effort goes into trying to
minimize it. However, the work in this report shows that a controlled
amount of leakiness can have a desirable effect in the application of signal
separation by power level. Another important observation is that leaky LMS
does not minimize the mean square error. It does find a Wiener solution,
but for a performance function corresponding to an input which contains
an artificial additive noise term. Weak input signals are excluded from the
filter output y(k) because of their lesser powers.
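
The leaky update described above can be sketched in a few lines of Python. This is a minimal illustration rather than the report's simulation program: it assumes the update W(k+1) = gamma W(k) + mu e(k) X(k) with e(k) = x(k) - y(k), and the function name `leaky_lms_ale` and every parameter value are hypothetical.

```python
import numpy as np

def leaky_lms_ale(x, n_taps, delay, mu, gamma):
    """Adaptive line enhancer driven by a leaky LMS update (illustrative sketch).

    Assumed update: w <- gamma * w + mu * e(k) * X(k); gamma = 1 gives plain LMS.
    """
    x = np.asarray(x, dtype=float)
    w = np.zeros(n_taps)
    y = np.zeros(len(x))
    e = np.zeros(len(x))
    for k in range(delay + n_taps - 1, len(x)):
        # Tap vector X(k): delayed samples x(k-delay), ..., x(k-delay-n_taps+1)
        xk = x[k - delay - n_taps + 1 : k - delay + 1][::-1]
        y[k] = w @ xk                 # filter output
        e[k] = x[k] - y[k]            # error against the undelayed input
        w = gamma * w + mu * e[k] * xk
    return y, e, w

k = np.arange(3000)
x = np.sqrt(2.0) * np.cos(2 * np.pi * 0.1 * k)   # unit-power sinusoid, no noise
y_plain, _, _ = leaky_lms_ale(x, n_taps=16, delay=1, mu=0.005, gamma=1.0)
y_leaky, _, _ = leaky_lms_ale(x, n_taps=16, delay=1, mu=0.005, gamma=0.96)
p_plain = np.mean(y_plain[-500:] ** 2)           # output power without leakage
p_leaky = np.mean(y_leaky[-500:] ** 2)           # output power with leakage
print(p_plain, p_leaky)
```

With these hypothetical settings the leak holds the convergent gain for the sinusoid well below unity, which is the behavior the text attributes to injected noise.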


IV.  Amplitude  Correction  Via  a  One-Weight  Noise  Canceller 

Reference 2 shows the results of computer simulations of the ALEWAIN.
These simulations demonstrate that the ALEWAIN can in fact slice off the
most powerful sinusoid of an input signal and that its performance is clearly
superior to that of the ALEWIN. However, for the ALEWAIN to function well
in the power separation scheme shown in Figure A-1, not only must the most
powerful component be isolated in the filter output, but it must also be
completely removed from the error signal. If this signal is not completely
extracted then some following stage might attempt to isolate the residual
rather than the next lower-powered signal. Unfortunately it can be shown
that the power ratio of the most and second most powerful signals must be
infinite to allow complete separation in one ALEWAIN stage. To achieve
complete separation with a finite ratio, an additional modification can be
made. This modification uses a single-weight Adaptive Noise Canceller
(ANC) [3].

A block diagram of the adaptive noise canceller is shown in Figure
A-8. The ANC has two inputs: the primary, which contains the desired signal
plus some corruptive influence, and the reference input, which contains
noise which is correlated with the corruptive influence in the primary
signal. By adaptively filtering the reference signal, noise may then be
subtracted from the primary signal to reduce the effect of the corruptive
influence. The use of an adaptive filter makes this noise subtraction
possible and practical by changing the adaptive filter weights to best
adjust the spectral content and phase of the reference signal, even if
the character of the signal and the undesired corruption change with time.

Using this concept, the problem of complete extraction of the most
powerful component can be dealt with. Refer to Fig. A-9. The input signal
x(k) forms the primary input. It may now be viewed as the sum of many
desirable components plus an undesired corruptive signal, the most powerful
coherent component. The reference signal, a correlated version of this
component, can be supplied by the ALEWAIN filter output y(k). This concept
is illustrated in Figure A-9. In the particularly useful case in which
the most powerful component is sinusoidal, only a single adaptive weight is
required in the noise canceller stage. This is a result of the fact that
the sinusoid in the ALEWAIN output is in phase synchronism with the most
powerful sinusoid in the input signal. Because only the gain (and not the
phase) must be adjusted, only one adaptive weight is required.

Figure A-9. ALEWAIN Modified with One-Weight "Noise Canceller"
to Effect Complete Cancellation of Most Powerful
Component
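
A single-weight canceller of this kind can be sketched as follows. The sketch assumes the update w <- w + mu e(k) y(k) with e(k) = x(k) - w y(k); the function name `one_weight_canceller`, the stand-in reference signal (the strong sinusoid scaled by 0.5, in phase) and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def one_weight_canceller(primary, reference, mu):
    """Single-weight LMS noise canceller (illustrative sketch).

    Assumed update: w <- w + mu * e(k) * reference(k),
    with e(k) = primary(k) - w * reference(k).
    """
    w = 0.0
    remainder = np.zeros(len(primary))
    for k in range(len(primary)):
        remainder[k] = primary[k] - w * reference[k]   # remainder (error) output
        w += mu * remainder[k] * reference[k]          # gain-only adaptation
    return remainder, w

k = np.arange(4000)
strong = np.sqrt(2.0) * np.cos(2 * np.pi * 0.10 * k)   # most powerful component
weak = 0.3 * np.cos(2 * np.pi * 0.23 * k)              # weaker component to keep
primary = strong + weak
reference = 0.5 * strong      # stand-in for an ALEWAIN output with gain 0.5
remainder, w = one_weight_canceller(primary, reference, mu=0.05)
print(w)
```

Because the reference is in phase with the strong component, the single weight only has to restore the gain; in this setup it should settle close to the reciprocal of the assumed filter gain.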

This concept may be viewed in another way. To perform signal separation
with the ALEWAIN it must do several things:

1) Identify the most powerful component

2) Isolate the most powerful component (i.e. form a filter to pass it)

3) Adjust the gain of the most powerful signal to provide complete
cancellation,

4) And, minimize the gain of the filter for less powerful components.

Unfortunately points 3) and 4) are contradictory. Attempting to make the
gain of the adaptive filter equal to one at the frequency of the most
powerful component has the simultaneous but undesirable effect of increasing
the gain for the less powerful components. The NC modification
allows the ALEWAIN to perform functions 1, 2, and 4, while the noise
canceller stage adjusts the isolated component to have unity gain for ideal
cancellation from the input signal. As will be shown in Section V, the
ALEWAIN with tandem one-weight adaptive noise canceller (denoted ALEWAIN + NC)
performs very well with narrowband inputs.


V. Experimental Results


To demonstrate the performance of the ALEWAIN + NC configuration, it
was simulated on an HP 2116B minicomputer. An experiment was designed to
test the ability of the ALEWAIN + NC to successfully strip off the coherent
components of an input in order of their power. The input signal was composed
of three sinusoids of different frequencies and powers plus white noise of
unit variance. The powers and frequencies of these sinusoids are as
follows:

               Frequency (Hz)     Power

Sinusoid #1        179.           78.625

Sinusoid #2        312.5           3.125

Sinusoid #3        607.0           0.125

Forty-eight hundred samples of the input signal were stored in a disk file,
serving as input data for the program which simulates the ALEWAIN + NC.
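
An input record of this kind could be generated as in the sketch below. The frequencies, powers, noise variance and record length follow the description above; the sampling rate `fs`, the random starting phases, the seed and the function name `make_test_input` are assumptions, since the text does not state them.

```python
import numpy as np

def make_test_input(n_samples=4800, fs=2048.0, seed=0):
    """Three sinusoids plus unit-variance white noise (illustrative sketch).

    fs is an assumed sampling rate; the source does not state one.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    freqs_hz = [179.0, 312.5, 607.0]
    powers = [78.625, 3.125, 0.125]
    x = rng.standard_normal(n_samples)           # white noise, unit variance
    for f, p in zip(freqs_hz, powers):
        phase = rng.uniform(0.0, 2.0 * np.pi)    # random starting phase (assumption)
        # A sinusoid of power p has amplitude sqrt(2 p)
        x += np.sqrt(2.0 * p) * np.cos(2.0 * np.pi * f * t + phase)
    return x

x = make_test_input()
print(len(x), np.mean(x ** 2))
```

The mean square of the record should come out close to the sum of the three sinusoid powers plus the unit noise variance.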

The program operates by taking input samples from a designated file and storing
the resulting noise canceller filter and difference outputs in additional
disk files. In this manner the same program can be used to provide several
stages of separation by simply specifying the input file of the current run
to be the difference signal from the previous run. In such a fashion the
input signal was subjected to three levels of slicing. At each level the
length of the ALEWAIN filter n and the decorrelation delay $\Delta$ were held
constant (64 and 1, respectively). To achieve separation within the
desired number of iterations, however, $\mu_1$, $\mu_2$, and $\gamma$ were
varied on the three runs. The values actually used are as follows:


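[Table of operating parameters for Run #1, Run #2, and Run #3, with
columns $\mu_1$, ECT, $\mu_2$, $\gamma$, and equivalent $\sigma^2$. The
entries are illegible in the source except for the values .9999375 and
3.125.]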

The abbreviation ECT stands for "estimated convergence time," expressed
in number of iterations. The column to the right of that for $\gamma$ shows
the value of $\sigma^2$ which would be required by the equivalent ALEWIN.
The strategy used for the choice of these operating parameters is discussed
in Section VI.

In actual practice, all the cascaded adaptive processors might be
allowed to begin adaptation simultaneously. For this simulation, however,
each cascaded processor was not allowed to adapt until the preceding
processor had reached convergence. The estimated convergence times were
used to determine the startup times. As noted before, this is not mandatory
in practice. It is done here to uncouple the transient nature of each stage
from all the others in order to study these adaptive transients.

Figure A-10 is a plot of 512 points of the input signal and the
associated power spectrum. The format of this figure will be used several
times so it will be explained in detail here. Part (a) of the figure
presents a 512-point record of the signal of choice (in this case, the input
to the first separator stage). The middle curve is the magnitude squared of
the DFT of the data record. This plot is scaled by its maximum value.
The bottom curve is the logarithm of the middle plot. While taking the
logarithm presents a less spectacular picture than the linear plot, it makes
second order effects such as input noise and the effects of filter
weight noise more visible. The arrows below the frequency axes indicate
the frequencies of the input sinusoids. They will be shown on all such
spectrum plots. Note that the power relationship between all the coherent
input components is visible in Figure A-10c.

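The two spectral curves described for this figure format can be computed as in the following sketch. It assumes a one-sided DFT and a decibel scaling for the logarithmic curve (the text only says "logarithm"); the function name `normalized_spectra`, the small floor that avoids taking the log of zero, and the two-tone test record are hypothetical.

```python
import numpy as np

def normalized_spectra(record):
    """Return (linear, log) power spectra of a data record, scaled by the peak."""
    power = np.abs(np.fft.rfft(record)) ** 2              # magnitude squared of the DFT
    linear = power / power.max()                          # scaled by its maximum value
    log_db = 10.0 * np.log10(np.maximum(linear, 1e-12))   # floor avoids log(0)
    return linear, log_db

k = np.arange(512)
# Two bin-centered tones, the second 40 dB below the first
record = np.cos(2 * np.pi * 64 * k / 512) + 0.01 * np.cos(2 * np.pi * 150 * k / 512)
linear, log_db = normalized_spectra(record)
print(np.argmax(linear), log_db[150])
```

On the linear curve the weaker tone is invisible, while the logarithmic curve shows it clearly, which is the reason the text gives for including the bottom plot.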
Figure A-11 shows the filter output of ALEWAIN #1 at the beginning
of adaptation. The exponential growth of its envelope is just as
predicted by other work on the ALE structure [1]. Figure A-12 shows the
filter output signal and its spectrum well after convergence (iteration
3000). From its spectrum it is clear that the most powerful sinusoid has
been essentially isolated.

Figure A-13 shows the strong output of stage #1. Adaptation begins
at iteration 500 (after ALEWAIN #1 converges) and converges in roughly 300
iterations to provide the proper scaling for sinusoid #1. That this
scaling has been properly found is illustrated in Figure A-14. This is
the remainder output of stage #1. At iteration 500 (when its adaptation
begins) the contribution from sinusoid #1 begins to decrease dramatically
and by iteration 800 it is virtually gone. Figure A-15 further demonstrates
this point by showing the remainder signal #1 and its spectrum beginning
at iteration 3000. From the log spectrum plot it is clear that the most
powerful sinusoid, #1, has been highly attenuated. In fact it has been
decreased by approximately 60 dB. (Additional experiments have shown
this to be a typical value.)

Figures A-16 through A-18 show the behavior of the second separation
stage. Its input is simply remainder output #1 (Fig. A-15). Adaptation
was begun at iteration 1000 (after the convergence of NC #1). Figure A-16
shows the output of ALEWAIN #2 and its spectrum after convergence
(iteration 3000). Sinusoid #2 appears almost exclusively at the filter
output. Figure A-18 demonstrates remainder signal #2 after convergence.
Sinusoid #2 (and #1, as well) has been almost completely extracted. The
spectral plot shows that only sinusoid #3 and the input noise remain in the
remainder output #2.


Figure A-11. Filter output signal from ALEWAIN #1 at start of adaptation.

Figure A-12. Filter output signal from ALEWAIN #1 at convergence.

If this difference output is applied to the third separation stage,
the remaining sinusoid (#3) can be separated from the input noise. This
signal and its spectrum are shown in Figure A-19.

This experiment demonstrates that the ALEWAIN + NC configuration
can successfully be used to effect separation of narrowband signals on
the basis of power when they do not overlap in frequency. The theory
developed so far (and the experiment presented) are based on the use of
sinusoids as the coherent inputs. With finite bandwidth narrowband
signals, so long as the bandwidth of the relatively coherent components
is considerably less than $1/\Delta$ ($\Delta$ = the decorrelation delay),
each signal separation stage will slice off a separate signal. However, it
may be necessary to use more than one weight in the NC stage to effect
complete separation. This will be discussed further in the next section.

VI.  Choice  of  Operating  Parameters 

The previous theory and simulations have assumed that sufficient a
priori knowledge about the input signal is available to the user so that
the values of the adaptation and leakage constants can be appropriately
adjusted. This section will explore some considerations in their choice.

A. Choice of the Leakage Coefficient $\gamma$

The slicing level of each separation stage is controlled by the
leakage coefficient $\gamma$ used for the associated ALEWAIN. This choice is
determined by the fact that the most powerful component must be passed while
less powerful components must be suppressed as much as possible. The nature
of this problem can be understood by reexamining Figure A-6. This is
the operating curve of the ALEWIN/ALEWAIN and shows the convergent gain
of the j-th input component as a function of $\mathrm{SNR}'_j$ (the input
SNR as modified by the algorithmically injected noise). Suppose that the
input contains two coherent components with a given power ratio. The
goal of making the gain $a^*$ of the weaker component as small as possible
while holding the gain $a^*$ of the stronger component fixed is clearly
achieved by operating on the left side of the operating curve (where the
curve is increasing approximately linearly). If the coherent components
are assumed to be sinusoids, the maximum possible ratio of gains can be
found analytically in terms of $\sigma'^2$, the total equivalent noise power
(i.e., the noise power including the injected noise for the ALEWIN, or its
algorithmic equivalent for the ALEWAIN). If both modified SNRs are large
compared to 2/n, then the gain ratio is approximately 1 and there is no gain
difference. This corresponds

to locating both components on the right hand side of Figure A-6. Since
there is no gain difference, separation cannot be attained. If, however,
both modified SNRs are considerably less than 2/n, then the ratio of the
gains approaches the ratio of the SNRs, that is, the power ratio of the two
components. This indicates that the best separation is attained by operating
as far left as possible in Figure A-6. Unfortunately this desirable behavior
is offset by the fact that operation in this region (where the injected
noise completely controls the dynamics of adaptation) tends to disturb
the gain relationships between the various parts of a non-sinusoidal input
component. As a result a single weight noise canceller will not suffice
to completely cancel this component in the NC remainder output. This
problem can be circumvented to some degree by adding more weights to the
NC stage. Experiments have shown, however, that generally good performance
can be attained for both sinusoidal and narrowband inputs by setting $\gamma$
(or $\sigma^2$) so that $a^*$, the gain for the most powerful component,
equals .5. This implies that $\mathrm{SNR}' \, n/2 = 1$ and corresponds to
the "knee" or breakpoint in the $a^*$ vs. SNR' curve. The values of $\gamma$
used in the various separation stages of the experiment shown in Section V
were chosen in this manner.
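
The trade-off along the operating curve can be checked numerically. The sketch below assumes the convergent gain has the form a* = (SNR' n/2) / (1 + SNR' n/2), which is consistent with a* = .5 at SNR' n/2 = 1; the function name `convergent_gain` and the power ratio of 10 are hypothetical.

```python
def convergent_gain(snr, n):
    """Assumed operating curve: a* = (SNR' n/2) / (1 + SNR' n/2)."""
    s = snr * n / 2.0
    return s / (1.0 + s)

n = 64
knee_snr = 2.0 / n            # modified SNR at which a* = 0.5
power_ratio = 10.0            # hypothetical power ratio of the two components
ratios = []
for strong_snr in (100 * knee_snr, knee_snr, 0.01 * knee_snr):
    weak_snr = strong_snr / power_ratio
    ratio = convergent_gain(strong_snr, n) / convergent_gain(weak_snr, n)
    ratios.append(ratio)
    print(f"strong SNR' = {strong_snr:.5f}: gain ratio = {ratio:.3f}")
```

Far to the right of the knee the gain ratio stays near one, and far to the left it approaches the power ratio, with the knee giving an intermediate value.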

In actual practice the power of the most powerful component is usually
unknown and this complicates the choice of $\gamma$. The value of $\gamma$
could be swept to search for a solution. By simply examining the input
waveform and attributing its maximum excursion to the most powerful sinusoid,
the power of this component may be roughly estimated by the RMS power of the
input itself. In the case where the coherent inputs have widely disparate
powers and all have SNRs greater than one, this is a fairly accurate estimate
and the value of $\gamma$ found from this estimate will work well. If the
conditions are not satisfied, then the performance will be poorer. However,
this adaptive structure is quite tolerant of small parameter mischoices
and will usually provide very good performance even if the power is
misestimated. If the SNR condition is not met, then the background noise
will determine the RMS input power and hence the choice of $\gamma$.

B. Time Constant Determination

The convergence times of the ALEWAIN and the NC are determined by
$\mu_1$ and $\mu_2$ respectively. The convergence behavior of these adaptive
processes can be quantified by finding the time constants associated with the
uncoupled modes in each of the processes' weight vectors. To determine this
behavior for the ALEWAIN, consider again Equation III-7c:

$$E[W(k+1)] = [\gamma I - \mu_1 R]\, E[W(k)] + \text{(driving term)}.$$

Since R is real and symmetric it is possible to find a coordinate
transformation which uncouples the modes of the adaptive process [8]. If this
is done the expected value of a typical uncoupled weight, $w_j(k)$ say, can
be written [8] as:

$$E[w_j(k+1)] = [\gamma - \mu_1 \lambda_j]\, E[w_j(k)] + \text{(driving term)},$$

where $\sigma_o^2$ is the input noise power and $\lambda_j$ is the eigenvalue
of R associated with the j-th uncoupled input mode. The growth time constant
of such a recursion expression can be shown to be:

$$\tau_j = \frac{1}{1 - \gamma + \mu_1 \lambda_j}.$$

Notice that if $\gamma = 1$ then the time constant degenerates to that for
the ALE. If $\mu_1 \lambda_j \gg 1 - \gamma$ then the adaptive time constant
for this mode is determined by $\mu_1$ and the powers of the input noise and
the uncoupled coherent component. If, however, $\mu_1$ or the input powers
are so small that $1 - \gamma > \mu_1 \lambda_j$ then the adaptive dynamics
are determined only by $\gamma$. If $1 - \gamma > \mu_1 \lambda_j$ for all j,
then the ALEWAIN becomes a recursive correlator [9] with a time constant of
$1/(1 - \gamma)$ for all modes. In the case of sinusoidal inputs these
solutions can be put in terms of the power of the most powerful sinusoidal
input. It can be shown [8] that a sinusoidal input induces two eigenvalues of
R which are approximately equal to $nP/2$, where P is the power of the
sinusoid. In this case the adaptive time constant associated with the two
modes of interest is given by:

$$\tau = \frac{1}{1 - \gamma + \mu_1 nP/2}.$$
Convergence of the adaptive algorithm is a matter of definition but a
practical value is twice the longest growth time constant of interest.
Therefore, for a sinusoidal input, the estimated ALEWAIN convergence time is
given by:

$$\mathrm{ECT}_1 = \frac{2}{1 - \gamma + \mu_1 nP/2}.$$

If the sinusoid is very strong then $\mathrm{ECT}_1$ tends to
$4/(\mu_1 n P)$. However, if $\mu_1$ is very small then $\mathrm{ECT}_1$
tends to $2/(1 - \gamma)$ (i.e. the recursive correlator case).
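
The two limiting cases can be checked numerically with a small helper. The sketch assumes the estimated convergence time is twice the time constant 1/(1 - gamma + mu1 n P/2) for a sinusoid of power P; the function name `alewain_ect` and the operating points are hypothetical and are not the values used in the report's runs.

```python
def alewain_ect(mu1, gamma, n, power):
    """Estimated convergence time: twice the time constant 1/(1 - gamma + mu1*n*P/2)."""
    return 2.0 / (1.0 - gamma + mu1 * n * power / 2.0)

n, power = 64, 78.625
ect_ale = alewain_ect(mu1=2e-5, gamma=1.0, n=n, power=power)        # no leak
ect_mixed = alewain_ect(mu1=2e-5, gamma=0.9999, n=n, power=power)   # leak and signal
ect_leak = alewain_ect(mu1=1e-12, gamma=0.9999, n=n, power=power)   # leak dominates
print(ect_ale, ect_mixed, ect_leak)
```

With no leak the estimate equals 4/(mu1 n P); when mu1 is negligible it approaches 2/(1 - gamma), and any leak shortens the estimate relative to the plain ALE case.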

Determination of the time constant for the one-weight noise canceller
is quite simple. The expected behavior of the weight is described by:

$$E[w(k+1)] = (1 - \mu_2 P_y)\, E[w(k)] + \mu_2 C,$$

where $P_y$ is the expected power in y(k), the ALEWAIN filter output, and C is
the crosscorrelation of the primary input with the ALEWAIN filter output.
In a fashion similar to that above the estimated convergence time can be
shown to be:

$$\mathrm{ECT}_{NC} = \frac{2}{\mu_2 P_y}.$$

The power in y(k) is determined by the value of $a^*$ chosen for the ALEWAIN
via $\gamma$. If the recommended choice of $a^* = .5$ is made and if the
component of interest is sinusoidal with input power P then
$\mathrm{ECT}_{NC}$ reduces to:

$$\mathrm{ECT}_{NC} = \frac{8}{\mu_2 P}.$$

As the assumption regarding infinite coherence (sinusoidal inputs) is
violated the time constant estimates given here become poorer. However,
these estimates work well in practice and should give good results. Further
insight into the convergence times of adaptive processors such as the ALE
and ALEWAIN can be found in reference 8.
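
As a rough check on the noise canceller estimate, the expected-weight recursion can be iterated directly. The sketch assumes the recursion E[w(k+1)] = (1 - mu2 P_y) E[w(k)] + mu2 C, an in-phase sinusoidal reference with gain 0.5 (so that P_y = P/4 and C = P/2), and hypothetical values for mu2 and P; `nc_ect` is an illustrative name.

```python
def nc_ect(mu2, p_y):
    """Estimated NC convergence time: twice the time constant 1/(mu2 * P_y)."""
    return 2.0 / (mu2 * p_y)

mu2, power = 0.01, 4.0          # hypothetical values
p_y = 0.5 ** 2 * power          # filter output power when a* = 0.5
c = 0.5 * power                 # crosscorrelation of primary with y(k) in that case
w_final = c / p_y               # fixed point of the recursion

w = 0.0
steps = round(nc_ect(mu2, p_y))
for _ in range(steps):
    w = (1.0 - mu2 * p_y) * w + mu2 * c    # expected-weight recursion
fraction = w / w_final
print(steps, fraction)
```

After the estimated convergence time, two time constants, the expected weight has covered roughly 1 - exp(-2) of the distance to its final value, which illustrates why convergence is described as a matter of definition.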


VII.  Conclusions 


This work has demonstrated in a preliminary way the practicality of
the ALEWAIN + NC adaptive processor as a signal sorter or separator,
particularly in the case where the desired signals are narrowband and their
signal-to-noise ratios are high (>1). Jamming signals would have such
high SNRs. Results presented here have demonstrated how to design a
signal separator, how to choose (either manually or automatically) the
operating parameters, and how to estimate the convergence time of such
a processor. These results should be very useful for a variety of
communications and signal processing applications (e.g., anti-jamming).
However, an equally important part of this research was the discovery and
preliminary examination of the leaky LMS (LLMS) adaptive algorithm as applied
to adaptive line enhancing. Even though much remains to be explored about
the behavior of this algorithm, it seems clear that it will have wide
applicability to sonar and radar signal processing. In spite of the fact
that it was conceived for the ALEWIN configuration through evolutionary
development, it can be generalized to more complex temporal and spatial
filtering applications.

ADAPTIVE BEAMFORMING WITH INJECTED NOISE
1. Introduction

In this section an adaptive antenna array system with a special pilot
signal is proposed and studied. The goal is to produce an antenna system
which responds to signals as a function of their power levels: the stronger
the signal, the more attenuation desired. A system of this type would
allow reception of weak signals in an environment containing stronger,
undesired signals (i.e., jammers).

An adaptive system is desired to cope with the non-stationary
character of most environments. The non-stationarity arises from signals
turning on and off, signals fading, signal sources moving in space, and
possibly the receiving array changing in physical orientation (due either
to being mounted on a moving vehicle, or on a base subject to stretching
and malformation).

The use of an antenna array as opposed to a single antenna is desired,
since this allows spatial filtering in addition to frequency filtering.
Therefore, the system can form nulls in its reception pattern in the
direction of undesired signals.

A final objective of the system is to maintain reception sensitivity
in directions where no signals are currently to be found. This allows
for immediate acquisition of desired signals when they start up and for
essentially unattenuated reception of low power signals arriving at unknown
directions of incidence.

In conclusion, we wish the adaptive antenna system to have the
following properties:

1) attenuation based on signal power strength
2) fast response to changes in the signals
3) response to changes in the array itself
4) receptivity in directions where no signals currently exist.

2. Presentation of the Algorithm

In this section we propose an adaptive algorithm for use in an
array antenna system, designed to fulfill the functions outlined in the
previous section. Due to the characteristics of the algorithm and its
relationship to conventional adaptive beamforming antenna systems, this
adaptive array antenna system will be called the Adaptive Beamformer
With Injected Noise (ABWIN).

2.1 Introduction to the Algorithm

The basic idea behind the algorithm is to feed into an adaptive
antenna array processor the signals received from the environment,
augmented by a specially chosen pilot signal. The pilot signal of the
ABWIN is designed to place "soft constraints" on the array's response to
signals; the intent is to have an omnidirectional reception capability in
the absence of strong (possibly jamming) signals, while attenuating strong
signals when they do occur, the degree of attenuation being a function of
the signal power and the pilot signal power.

2.2  The  ABWIN 

The  structure  of  the  adaptive  array  system  will  now  be  described;  a 
discussion  on  the  pilot  signal  will  then  be  presented. 

Figure B.1 illustrates the structure of the adaptive array system.
Signals are received from the environment by an array of antenna elements
(the array geometry is shown in the figure for illustrative purposes
as six elements in a circular pattern; the geometry of an actual antenna
array may be any configuration). Added to the outputs of the antenna
elements are the individual components of the pilot signal (labelled $n_1$
through $n_6$ in the figure). The resulting signals are inputs to a set of
transversal filters, whose outputs are summed to produce the array's
output. This output is subtracted from the pilot signal, producing an error
signal which is used by the ABWIN for updating the impulse response of the
transversal filters.

Figure B.1 Structure of the ABWIN for a six element antenna array.

We see that the array's output contains the pilot as well as the
received signals, which is clearly undesirable. To overcome this problem,
a second set of transversal filters is established so that the received
signal can be passed through this set without the addition of the pilot
signal. Therefore, one set (the training filters) is used for the adaptation
or training of the system; the second may be regarded as an operational
set of filters whose output is the useful system output. We will assume
that the reference signal is formed as illustrated in the figure: simply
add the pilot signal components ($n_1$ through $n_6$), and use the result as
the reference signal in the adaptation algorithm. Other methods of forming
the pilot signal are possible and will be discussed later. The adaptation
algorithm used for adjustment of the transversal filter weights is the
Widrow-Hoff Least Mean Square (LMS) algorithm [ ], which will be discussed in
more detail in the next section.

The pilot signal of the ABWIN is constructed in a special manner.
Each of the pilot signal components ($n_1$ through $n_6$ in the figure) is a
noise signal, generated independently of the other pilot signal components
and of the external signals. The pilot signal is then the sum of the pilot
signal components. This method of construction gives the pilot signal the
property that it does not appear to be arriving from any specific direction,
unlike the pilot signal of conventional adaptive beamformers [ ]. The
effect of a pilot signal constructed in this manner is described in a later
section.

2.3 Mathematical definition of the ABWIN

We will now describe the ABWIN mathematically. Let there be M
antenna elements. Denote the output of sensor i at time k by $s_i(k)$
(i = 1, ..., M). Denote the component of the pilot signal added to the i-th
sensor signal by $n_i(k)$. Denote their sum by $u_i(k)$:

$$u_i(k) = s_i(k) + n_i(k)$$

Associated with each element is a transversal filter (TF), where $TF_i$ is
associated with the i-th element. Each TF can be described by two
n-dimensional vectors:

a) the contents of the tapped delay line. For the training filters
we will denote the contents of the delay elements at time k of $TF_i$
by

$$U_i(k) = [u_i(k)\; u_i(k-1) \ldots u_i(k-n+1)]^T,$$

where n is the number of elements in the tapped delay line. For the
operational filters we will denote the contents of the delay line of $TF_i$
at time k by

$$S_i(k) = [s_i(k)\; s_i(k-1) \ldots s_i(k-n+1)]^T.$$

b) the weights of the transversal filter, which are the same for the
training filter $TF_i$ and the operational filter $TF_i$. We will
denote the weight vector at time k of both filters by:

$$W_i(k) = [w_{i1}(k)\; w_{i2}(k) \ldots w_{in}(k)]^T.$$

Using this notation, the output of the training filter $TF_i$ at time k is
$\hat{y}_i(k) = W_i^T(k) U_i(k)$, and the output of the operational filter
$TF_i$ at time k is $y_i(k) = W_i^T(k) S_i(k)$.


For the purpose of writing the adaptation algorithm in vector
notation, we need the following vectors:

U(k) = the augmented tapped delay line contents vector of the training
set of filters (dimension Mn x 1)

$$U(k) = [U_1^T(k)\; U_2^T(k) \ldots U_M^T(k)]^T$$

S(k) = the augmented tapped delay line contents vector of the
operational set of filters (dimension Mn x 1)

$$S(k) = [S_1^T(k)\; S_2^T(k) \ldots S_M^T(k)]^T$$

N(k) = the augmented pilot signal vector (dimension Mn x 1)

$$N(k) = U(k) - S(k)$$

W(k) = the augmented TDL weight vector (dimension Mn x 1)

$$W(k) = [W_1^T(k)\; W_2^T(k) \ldots W_M^T(k)]^T$$

Therefore, the output of the training filters (which have the pilot signal)
is $\hat{y}(k) = W^T(k) U(k)$. The output of the operational filters (which
contain only the sensor signals) is $y(k) = W^T(k) S(k)$.


The operation of the ABWIN can now be described mathematically. At
time k:

1) Input the new data, shifting the old data down the tapped delay
lines:

$$U(k) = A\,U(k-1) + B\,u(k)$$
$$S(k) = A\,S(k-1) + B\,s(k)$$

where $u(k) = [u_1(k) \ldots u_M(k)]^T = s(k) + n(k)$,
$s(k) = [s_1(k) \ldots s_M(k)]^T$,
A = the tapped delay line shift matrix,
B = the tapped delay line input matrix.

These equations can also be written in component form: in each $U_i(k)$
the first element becomes $u_i(k)$ and every other element takes the value
held by its predecessor in $U_i(k-1)$.

2) Calculate the system output:

$$\hat{y}(k) = W^T(k) U(k)$$
$$y(k) = W^T(k) S(k)$$

3) Calculate the system error for use in the adaptation equation:

$$e(k) = d(k) - \hat{y}(k)$$

For most of the remainder of this report we will assume the pilot
(or "desired") signal is formed as follows:

$$d(k) = \sum_{i=1}^{M} n_i(k)$$

Other possibilities will be discussed in a later section.

4) Perform the LMS adaptation:

$$W(k+1) = W(k) + 2\mu\, e(k)\, U(k)$$

where $\mu$ is the adaptation constant.
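
The four steps above can be sketched as a short training loop. This is an illustration under stated assumptions, not the report's implementation: the function name `abwin_train`, the array layout (one row of taps per sensor), and the choice of four sensors, three taps, and unit-variance pilot noise with no external signals are all hypothetical.

```python
import numpy as np

def abwin_train(s, pilot, n_taps, mu):
    """ABWIN training loop (illustrative sketch).

    s, pilot: arrays of shape (num_samples, M) holding the sensor signals
    s_i(k) and the pilot components n_i(k). Returns weights of shape (M, n_taps).
    """
    num_samples, m = s.shape
    u_line = np.zeros((m, n_taps))     # training tapped delay lines U_i(k)
    w = np.zeros((m, n_taps))          # weights W_i(k)
    for k in range(num_samples):
        # 1) shift the old data down the delay lines and input the new data
        u_line = np.roll(u_line, 1, axis=1)
        u_line[:, 0] = s[k] + pilot[k]
        # 2) training-filter output W^T(k) U(k)
        y_train = np.sum(w * u_line)
        # 3) error against the pilot signal d(k) = sum of the pilot components
        e = np.sum(pilot[k]) - y_train
        # 4) LMS adaptation
        w += 2.0 * mu * e * u_line
    return w

rng = np.random.default_rng(0)
pilot = rng.standard_normal((3000, 4))   # independent white pilot components
s = np.zeros((3000, 4))                  # no external signals present
w = abwin_train(s, pilot, n_taps=3, mu=0.01)
print(np.round(w, 3))
```

With only the pilot present, the weights in this sketch settle at one on the first tap of every filter and zero elsewhere, so each sensor is passed with equal gain and no delay.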


2.4 The Pilot Signal

To this point very little has been said about the pilot signal and its
components, but the functioning of the ABWIN depends heavily on the pilot
signal.

It was stated previously that the purpose of the pilot signal in
the ABWIN is to place "soft" constraints on the array's response to signals,
the intent being to maintain an omnidirectional reception capability in
the absence of signals. To attain the capability of omnidirectionality,
the reference signal components ($n_i(k)$, i = 1, ..., M) are independent
white noises.

Consider  a  signal  received  by  the  array  from  an  external  source. 

Such  a  signal  appears  identical  to  each  sensor,  with  the  exception  that 
the  signal  may  arrive  at  a  different  time.  Thus  there  is  a  correlation 
between  sensors  for  a  signal  with  any  spatial  orientation. 

The ABWIN pilot signal is different, however. Since the pilot signal
components are independent noises, there is no correlation between the
pilot signal components on different sensors. Thus, the pilot signal
component from one sensor cannot be used to cancel any portion of the pilot
signal component of another sensor. Because of this, the pilot signal
does not appear to arrive from a specific direction. In other words, it
does not exhibit any spatial orientation.

With a pilot signal created as described above, the ABWIN cannot
place a lobe in a given direction to enhance reception of the pilot signal
as in conventional adaptive arrays. At the same time, any change to the TF
weights does affect the system's response to the pilot signal. Thus, in
the absence of external signals, the ABWIN attains a reception pattern
which will be referred to as the "quiescent" pattern. When an external signal
is received, the ABWIN reacts to place a null in the reception pattern in
the direction and frequency of the received signal. This null also decreases
the gain of the pilot signal through the system; as a result, the ABWIN
will adjust to a reception pattern that "balances" the amount of the
pilot signal lost against the amount of the external signal that is allowed
to pass. This, then, is the "soft constraint" capability of the ABWIN.

3. The Quiescent Pattern of the ABWIN

Let us now determine the quiescent pattern of the ABWIN, by
examining the impulse response of the various transversal filters in the
absence of signals received by the elements.

Consider the way in which the pilot signal is generated. The signal
used for reference purposes is the sum of the individual components.
Therefore, we see from Figure 1 that in the absence of any external signals,
the adaptive array can obtain zero error if its output is simply the sum
of the current inputs to the TF's. This is achieved when each TF has a
weight vector which is zero except for the weight corresponding to the
most recent input. This most recent input has a weight of 1 associated
with it. Thus, the quiescent system weight vector is:

$$ W_q = [\,1\ 0\ \cdots\ 0 \;\;\; 1\ 0\ \cdots\ 0 \;\;\; \cdots \;\;\; 1\ 0\ \cdots\ 0\,]^T $$

In other words, the response of each TF to a unit impulse is a unit impulse.

Note  that  the  zero  weights  arise  from  the  fact  that  white  noise 
sources  are  used  for  generating  the  pilot  signal  components.  Since  white 
noise  has  no  time  correlation,  the  pilot  signal  components  from  previous 
time  samples  are  of  no  aid  in  "predicting"  the  reference  signal  at  this 
time instant. Thus a zero weight is associated with all delayed samples.

The  weights  associated  with  the  current  inputs  must  be  unity.  This 
is  a  result  of  the  statistical  independence  of  the  individual  pilot 
signal  components.  Since  the  components  are  statistically  independent,  none 
of the components is any aid in "predicting" the value of another component.
Thus to change a weight from unity would only add to the system output a
quantity which could not be cancelled by the pilot signal, resulting in
a non-zero error.

Thus  we  see  that  in  the  absence  of  external  signals,  the  ABWIN  can 
produce  a  zero  error  by  selection  of  a  unique  weight  vector,  which  has 
the  effect  of  just  summing  the  current  inputs  to  produce  the  output,  with 
no  dependence  on  past  inputs. 

This  "quiescent"  weight  vector  determines  the  "quiescent"  array 
reception  pattern,  in  conjunction  with  the  antenna  array  geometry.  This 
quiescent  pattern  is  simply  the  pattern  obtained  when  the  antenna  element 
outputs  are  directly  summed.  Thus  the  sensor  geometry  has  a  direct  effect 
on the quiescent pattern of the ABWIN, but does not affect the quiescent
weight  vector.  A  method  of  modifying  the  pilot  signal  to  modify  the 
quiescent  weight  vector,  allowing  a  broader  choice  of  quiescent  reception 
pattern,  is  proposed  in  a  later  section. 




4. Analysis of the Convergence Point of the ABWIN

In this section we will analytically determine the mean (expected
value) of the weight vector at convergence. Begin with the adaptation
equation:

$$ W(k+1) = W(k) + 2\mu\,e(k)\,U(k) = W(k) + 2\mu\,U(k)\,e(k) $$

Substituting for the error:

$$ \begin{aligned}
W(k+1) &= W(k) + 2\mu\,U(k)\,[d(k) - y(k)] \\
       &= W(k) + 2\mu\,U(k)\,[d(k) - U^T(k)W(k)] \\
       &= W(k) + 2\mu\,d(k)\,U(k) - 2\mu\,U(k)U^T(k)W(k)
\end{aligned} $$

Take the expectation:

$$ E\{W(k+1)\} = E\{W(k)\} + 2\mu\,E\{d(k)U(k)\} - 2\mu\,E\{U(k)U^T(k)W(k)\} $$

Now we make the approximation that

$$ E\{U(k)U^T(k)W(k)\} = E\{U(k)U^T(k)\}\,E\{W(k)\} $$

which is good for small $\mu$. Under this approximation:

$$ E\{W(k+1)\} = E\{W(k)\} + 2\mu\,E\{d(k)U(k)\} - 2\mu\,E\{U(k)U^T(k)\}\,E\{W(k)\} $$

At convergence, we have $E\{W(k+1)\} = E\{W(k)\}$, which is achieved when

$$ E\{U(k)U^T(k)\}\,E\{W(k)\} = E\{d(k)U(k)\} $$

Therefore at convergence,

$$ E\{W(k)\} = E\{U(k)U^T(k)\}^{-1}\,E\{d(k)U(k)\} $$

For ease of notation, let

$$ R_{uu} = E\{U(k)U^T(k)\}, \qquad P = E\{d(k)U(k)\}, \qquad W = E\{W(k)\} $$

Thus the converged weight vector satisfies

$$ W = R_{uu}^{-1} P. $$

Now, since

$$ d(k) = \sum_{i=1}^{N} n_i(k) $$

we have

$$ P = E\{d(k)U(k)\} = \sum_{i=1}^{N} E\{n_i(k)U(k)\} $$

However, the pilot signals are white and independent of one another and any
external signals. Therefore,

$$ P = \sigma^2\,[\,1\ 0\ \cdots\ 0 \;\;\; 1\ 0\ \cdots\ 0 \;\;\; \cdots \;\;\; 1\ 0\ \cdots\ 0\,]^T $$

where $\sigma^2$ is the power of a pilot signal component: $\sigma^2 = E\{n_i^2\}$.

Furthermore, $R_{uu}$ can be decomposed into a contribution from the external
signals and one from the pilot signals.

$$ R_{uu} = E\{[S(k)+N(k)]\,[S^T(k)+N^T(k)]\} $$

But the $n_i(k)$'s were constructed to be independent of the $s_i(k)$'s. So
$R_{uu}$ can be broken into the sum of external signal and pilot signal covariance
matrices.

$$ R_{uu} = E\{S(k)S^T(k)\} + E\{N(k)N^T(k)\} = R_{ss} + R_{nn} $$

Thus, for the mean weight vector at convergence, we have

$$ W = (R_{ss} + R_{nn})^{-1} P $$

Now, since the $n_i$ are independent white noises of variance $\sigma^2$,

$$ R_{nn} = \sigma^2 I $$

where I is an identity matrix. Thus

$$ W = (R_{ss} + \sigma^2 I)^{-1} P $$

The term $R_{ss}$ itself may be the sum of a set of matrices, each dependent on
signals received by the array.

We see that the converged weight vector is a function of the reference
signal power $\sigma^2$, as well as a function of the signals in the
tapped delay line ($R_{ss}$). $R_{ss}$ is dependent upon both the time correlation
characteristics of the external signals and the sensor geometry.

A signal with non-zero correlation across a delay of one or more time
samples will introduce non-zero off-diagonal terms in $R_{ss}$, due to the
tapped delay lines. In addition, an external wavefront received by the
sensor array arrives at each sensor delayed in time by an amount
dependent upon the sensor geometry and the direction of arrival of the
wave. This geometry-dependent time delay affects correlation between
the contents of the tapped delay lines of different sensors. Thus we
see that the correlation matrix $R_{ss}$ is a complicated function of
the statistical characteristics of the received signals and the sensor
geometry.

We see that in the absence of external signals, $R_{ss} = 0$, and the
quiescent weight vector is:

$$ W_q = (\sigma^2 I)^{-1} P = [\,1\ 0\ \cdots\ 0 \;\;\; 1\ 0\ \cdots\ 0 \;\;\; \cdots\,]^T $$

as predicted in an earlier discussion.
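
The converged mean weight vector can be evaluated numerically by solving $(R_{ss} + \sigma^2 I)W = P$. The sketch below uses a small Gauss-Jordan solver so that it needs no external libraries; the function names (`solve`, `pilot_correlation`, `converged_weights`) and the ordering of the weights (all taps of sensor 1, then all taps of sensor 2, and so on, newest tap first) are assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def pilot_correlation(n_sensors, taps, sigma2):
    """P = E{d(k)U(k)}: sigma2 at the newest tap of each delay line, else 0."""
    return [sigma2 if t == 0 else 0.0
            for _ in range(n_sensors) for t in range(taps)]

def converged_weights(R_ss, n_sensors, taps, sigma2):
    """Mean converged weights W = (R_ss + sigma2 * I)^-1 P."""
    size = n_sensors * taps
    A = [[R_ss[i][j] + (sigma2 if i == j else 0.0) for j in range(size)]
         for i in range(size)]
    return solve(A, pilot_correlation(n_sensors, taps, sigma2))
```

For example, passing an all-zero `R_ss` reproduces the quiescent vector, while the one-sensor, one-tap case with signal power 10 and pilot power 5 gives a single weight of 5/(10 + 5) = 1/3.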


5. Application of the ABWIN

5.1 Introduction

In this section the ABWIN is applied to a particular array configuration
under several different signal environments. The simulation of the ABWIN
is compared with results calculated from the theory presented earlier
to demonstrate the validity of the theory. Then the theory is used to
calculate the ABWIN behavior under different signal conditions to
demonstrate the ABWIN response to a signal on the basis of its power.

5.2  The  Array  Configuration 

Figure B.2 shows the sensor geometry used in this section. The
speed of signal propagation is 1, the sampling interval is .125, and each
tapped delay line contains 8 taps. Thus each tapped delay line will hold
1 cycle of a sine wave of frequency 1, and if the signal sine wave (with
frequency 1) is arriving from the 0° direction, the signal at sensor 4 is
shifted 180° in phase from the signal at sensor 1.

5.3 Simulation Results

This section presents an example where the converged value of the
mean weight vector is computed from the theoretical results presented
earlier, and compared with an actual weight vector obtained from a computer
simulation of the ABWIN.

In this example two signals are present. The first signal has a
frequency of 1., a power of 1., and is arriving from a direction of 18°.
The second signal has a frequency of 2., a power of 10., and is arriving
from a direction of 108°. The pilot signal component power ($\sigma^2$) is 5.

The  calculated  value  of  W  is  compared  below  with  a  weight  vector  obtained 
from  a  simulation  of  the  ABWIN.  Note  that  the  simulation  weight  vector 
is  an  instantaneous  weight  vector;  no  averaging  was  used  to  obtain  this 
vector. 

The weight vector consists of 48 (6x8) elements. The first eight
elements correspond to the TF associated with sensor 1 (TF 1). The next eight
elements correspond to TF 2, and so on. The first weight of each set of
eight corresponds to the weight associated with the most recent sample, the
second to the next oldest sample, and so on.

The theoretical and measured weight vectors are presented in Table B.1.

TABLE B.1  THEORETICAL AND MEASURED WEIGHT VECTORS

W (theoretical)

        tap 1    tap 2    tap 3    tap 4    tap 5    tap 6    tap 7    tap 8
TF 1   1.0358   -.1375   -.1407   -.0003    .0508    .0109    .0541    .1269
TF 2    .9228   -.0245   -.0510   -.1025   -.0005    .1568    .1287   -.0297
TF 3    .8309   -.0380    .1079    .0823    .0160    .0485    .0452   -.0929
TF 4   1.0358    .1269    .0541    .0109    .0508   -.0003   -.1407   -.1375
TF 5    .9228   -.0297    .1287    .1568   -.0005   -.1026   -.0510   -.0245
TF 6    .8309   -.0929    .0452    .0485    .0160    .0823    .1079   -.0380

ABWIN simulation

        tap 1    tap 2    tap 3    tap 4    tap 5    tap 6    tap 7    tap 8
TF 1   1.0344   -.1224   -.1326    .0020    .0559    .0015    .0332    .1244
TF 2    .9224   -.0231   -.0582   -.1103    .0031    .1600    .1174   -.0451
TF 3    .8232   -.0515    .0883    .0735    .0153    .0572    .0579   -.0781
TF 4   1.0269    .1119    .0490    .0171    .0494   -.0040   -.1349   -.1472
TF 5    .9055   -.0243    .1391    .1648    .0048   -.1075   -.0565   -.0365
TF 6    .8271   -.0978    .0443    .0560    .0288    .0895    .1173   -.0400


As can be seen, there is very good agreement between the theoretical
and actual weight vectors. In all cases examined in this research, good
agreement between actual and theoretical values was obtained.

5.4 Results Calculated from Theory

This section presents results in which the theory presented earlier
is used to calculate the response of an ABWIN to a set of situations which
allow examination of the ABWIN performance.

5.4.1 Signal of Frequency = 1, Power = 1, Direction = 18°

In this section, a single signal of frequency 1 and power 1 is
impinging on the array of sensors, arriving from a direction of 18°.
Table B.2 below shows the gain of the ABWIN in the direction of the signal
as a function of the power of the pilot signal components. Figures B.3
and B.4 show the entire antenna pattern at a frequency of 1 for two of
the cases in Table B.2. The crosshairs on the figures show the receiving array
gain in the direction of signal arrival. The stronger the pilot signal,
the relatively weaker the received signal is, and the more like a signal and
less like a powerful jammer it appears to the system. So, the stronger the
pilot signal, the lower the notching effect seen by the actual signal of unit
power.

Figure B.4  Signal of Frequency = 1., Power = 1., Direction = 18°, with
Pilot Signal Component Power = 5. (Conditions differ from
Figure B.3 only in pilot signal component power).


TABLE B.2

ANTENNA POWER GAIN AT FREQUENCY = 1, DIRECTION = 18°, SIGNAL POWER = 1

Pilot Signal                 Signal Power
Component Power (σ²)         Gain

5.4.2 Signal of Frequency = 1, Power = 10, Direction = 18°

This is the same as the previous situation, except the power of the
incoming signal has been increased from 1 to 10. Table B.3 and Figures
B.5 and B.6 present the results. Once again, the weaker the pilot signal,
the greater the rejection of the received signal. The more powerful
received signal (of power 10) is more strongly rejected by the adaptive
antenna than that of unit power.

Figure B.5  Signal of Frequency = 1, Power = 10, Direction = 18°, with
Pilot Signal Component Power = 30. (Same conditions as
Figure B.3 except the received signal is of greater power here).

Figure B.6  Signal of Frequency = 1, Power = 10., Direction = 18°, with
Pilot Signal Component Power = 5. (Same conditions as Figure B.4
except the received signal is of greater power here. Same
conditions as Figure B.5 except pilot signal component power is
less here).


TABLE B.3

ANTENNA POWER GAIN AT FREQUENCY = 1, DIRECTION = 18°, SIGNAL POWER = 10

Pilot Signal                 Signal Power
Component Power (σ²)         Gain

100.
 30.                         .991 x 10^-1
 10.                         .128 x 10^-1
  5.                              x 10^-?
  1.                         .138 x 10^-?
   .1                        .139 x 10^-?

5.4.3 Signal of Frequency = 2, Power = 1, Direction = 108°

Table B.4 shows the gain of the ABWIN to a signal of frequency = 2,
direction = 108°, and power = 1. Figures B.7 and B.8 show the entire
antenna pattern for the cases of reference signal component power = 30
and 5. The loss of signal is roughly similar to the previous cases
(Fig. B.3 and B.4).

Figure B.7  Signal of Frequency = 2, Power = 1, Direction = 108°, with
Pilot Signal Component Power = 30. (Similar conditions to
Figure B.3 except signal is of different frequency and direction).

Figure B.8  Signal of Frequency = 2, Power = 1, Direction = 108°, with
Pilot Signal Component Power = 5. (Similar conditions to Figure
B.4 except signal is of different frequency and direction.
Same conditions as Figure B.7 except the pilot signal component
power is weaker here).


TABLE B.4

ANTENNA POWER GAIN AT FREQUENCY = 2, DIRECTION = 108°, POWER = 1

Pilot Signal                 Signal Power
Component Power (σ²)         Gain

100.                         2.30
 30.                         1.09
 10.                         .306
  5.                         .105
  1.                         .565 x 10^-?
   .1                        .608 x 10^-?

5.4.4 Signal of Frequency = 2, Power = 10, Direction = 108°

This situation is the same as the previous case except the signal
power has been increased from 1 to 10. Stronger signal losses result.
Table B.5 and Figures B.9 and B.10 present the measured responses.

Figure B.9  Signal of Frequency = 2, Power = 10, Direction = 108°, with
Pilot Signal Component Power = 30. (Similar to conditions
of Figure B.5, except signal is of different frequency and
direction. Same conditions as Figure B.7 except signal power
is greater here).


[Plot: gain (in dB) versus signal direction (degrees)]

Figure B.10  Signal with Frequency = 2, Power = 10, Direction = 108°

(a) Reception Pattern at Frequency 1
Figure B.11  Two Signals
Signal 1: Frequency = 1., Power = 1., Direction = 18°
Signal 2: Frequency = 2., Power = 10., Direction = 108°
with Pilot Signal Component Power = 30.

(b) Reception Pattern at Frequency 2
Figure B.11 (continued)


TABLE B.5

ANTENNA POWER GAIN AT FREQUENCY = 2, DIRECTION = 108°, POWER = 10

Pilot Signal                 Signal Power
Component Power (σ²)         Gain

100.                         .306
 30.                         .436 x 10^-1
 10.                         .565 x 10^-?
  5.                         .147 x 10^-?
  1.                         .608 x 10^-?
   .1                        .613 x 10^-?



5.4.5 Two Signals: Signal 1: Frequency = 1, Power = 1, Direction = 18°
                   Signal 2: Frequency = 2, Power = 10, Direction = 108°

In this situation we have the case of two signals, of different frequency
and different direction, where one signal is much stronger than the other.
The ABWIN reacts in such a way that, at the output, signal 1 is stronger
than signal 2, even though at the input it is the weaker of the two. The
phenomenon is similar to "inversion of signal to noise ratios." Table B.6
shows the results, and Figure B.11 shows the antenna patterns at the two
frequencies for the case where the pilot signal component power = 30, and
Figure B.12 is the analogous figure for the case where the pilot signal
component power = 5.

The adaptive system handles the two signals essentially independently:
it somewhat attenuates the weak signal and strongly attenuates the strong
signal.

Figure B.12  Two Signals
Signal 1: Frequency = 1., Power = 1., Direction = 18°
Signal 2: Frequency = 2., Power = 10., Direction = 108°
with Pilot Signal Component Power = 5.
(Same conditions as Figure B.11 except pilot
component power is less here).


TABLE B.6

ANTENNA POWER GAIN IN THE DIRECTION OF TWO SIGNALS:

SIGNAL 1: Frequency = 1., Direction = 18°, Power = 1.
SIGNAL 2: Frequency = 2., Direction = 108°, Power = 10.

Pilot Signal           Signal 1           Signal 2
Component Power        Power Gain         Power Gain

100.                   5.22               .306
 30.                   2.48               .436 x 10^-1
 10.                   .694               .565 x 10^-?
  5.                   .239                    x 10^-?
  1.                   .128 x 10^-1       .608 x 10^-?
   .1                  .138 x 10^-?       .614 x 10^-?


5.4.6 Two Signals: Signal 1: Frequency = 1, Power = 10, Direction = 18°
                   Signal 2: Frequency = 2, Power = 1, Direction = 108°

This situation is the same as the previous section except the power
levels of the two signals have been interchanged. We see from Table B.7
that the gains of the antenna in the signal directions and frequencies
have also switched, resulting once again in greater attenuation for the
stronger signal. Figure B.13 shows the antenna patterns at the two
frequencies for the case where the reference signal component power = 30,
and Figure B.14 is the analogous case for the reference signal component
power = 5. Figures B.13 and B.14 may be directly compared with Figures
B.9 and B.10 respectively.

(a) Reception Pattern at Frequency 1
Figure B.13  Two Signals
Signal 1: Frequency = 1, Power = 10, Direction = 18°
Signal 2: Frequency = 2, Power = 1., Direction = 108°
with Pilot Signal Component Power = 30. (Same conditions
as Figure B.11 except the signal powers have been
interchanged).

(b) Reception Pattern at Frequency 2
Figure B.13 (continued)

Figure B.14  Two Signals
Signal 1: Frequency = 1, Power = 10, Direction = 18°
Signal 2: Frequency = 2, Power = 1, Direction = 108°
with Pilot Signal Component Power = 5.
(Same conditions as Figure B.12 except the powers of the two
signals have been interchanged. Same conditions as Figure B.13
except the pilot signal component power is less here).

(b) Reception Pattern at Frequency 2
Figure B.14 (continued)


TABLE B.7

ANTENNA POWER GAIN IN THE DIRECTION OF TWO SIGNALS:

Signal 1: Frequency = 1., Direction = 18°, Power = 10.
Signal 2: Frequency = 2., Direction = 108°, Power = 1.

Pilot Signal
Component Power        Signal 1           Signal 2
(Noise Power)          Power Gain         Power Gain

100.                   .694               2.30
 30.                   .991 x 10^-1       1.09
 10.                   .128 x 10^-1       .306
  5.                   .334 x 10^-?       .105
  1.                   .138 x 10^-?       .505 x 10^-?
   .1                  .139 x 10^-?       .608 x 10^-?


5.5 Summary of the Simulations

From section 5.4 we see that the pilot signal component power has a
direct effect on the attenuation an external signal experiences. An
important parameter is the ratio of signal power to pilot power. The two
incident signals used in the simulations were handled essentially independently
by the system. Figure B.15 illustrates how the received signals are
attenuated as their power increases. Weak signals pass while strong jammers
are attenuated.

The results of section 5.4 support the goal of ABWIN response to a
signal on the basis of its power level. In the examples given, the effect
is so strong that the relative power levels of the signals at the output
of the array system are reversed from those at the input.

Figure B.15 is a summary figure which shows the relationship between
the gain of the ABWIN and the ratio of the signal power to the pilot
signal power. From this figure we clearly see the effect the pilot signal
power has on the ABWIN's response to a signal. (The data points for the
signal of frequency 1 were extracted from Tables B.2, B.3, B.6, and B.7.
Similarly, the data for the signal of frequency 2 were extracted from Tables
B.4, B.5, B.6, and B.7).
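
A single-tap idealization helps show why the gain depends on the ratio of signal power to pilot power. The following is an illustrative special case, not a result taken from the report: assume N sensors with one weight each and a single external signal of power p that arrives in phase at every sensor, so that $R_{ss} = p\,\mathbf{1}\mathbf{1}^T$ and $P = \sigma^2\mathbf{1}$, where $\mathbf{1}$ is the all-ones vector and G denotes the power gain in the signal direction (these symbols are assumptions of the sketch).

```latex
\begin{aligned}
W &= (R_{ss} + \sigma^2 I)^{-1} P = \frac{\sigma^2}{\sigma^2 + N p}\,\mathbf{1}, \\
G &= (\mathbf{1}^T W)^2 = \frac{N^2}{\left(1 + N p/\sigma^2\right)^2}.
\end{aligned}
```

Under these assumptions G approaches the quiescent value $N^2$ when $p/\sigma^2$ is small and falls as $(\sigma^2/p)^2$ when $p/\sigma^2$ is large, so the output power $pG \approx \sigma^4/p$ decreases as the input power grows. The multi-tap cases tabulated in section 5.4 are more complicated, but show the same qualitative trend.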

6. Modifying the ABWIN Pilot Signal

In previous sections the assumption was made that the pilot signal was
constructed by summing the individual pilot signal components ($n_i$). It
was shown that this resulted in a particular quiescent weight vector, which,
with the sensor geometry, determined the quiescent reception pattern for
the system. A problem exhibited itself in that the quiescent pattern was
the same pattern obtained by summing the antenna element outputs. In
many situations the pattern so obtained may be unacceptable for the
application.

Figure B.15  ABWIN gain as a function of signal to pilot signal power
ratio.

By modifying the formation of the pilot signal, the quiescent pattern
may be modified, and thus the sensor geometry may be taken into consideration.
Consider the quiescent pattern as discussed in section 4. In that
section, it was shown that $W_q = (\sigma^2 I)^{-1} P$ (that is, $W_q = P$ for
unit pilot signal component power), where $P = E\{d(k)U(k)\}$ is the correlation
between the pilot signal and the contents of the tapped delay lines. Since
it was assumed the pilot signal components ($n_i$) were uncorrelated with the
antenna element outputs, we see that the only significant contents of the
tapped delay lines are delayed samples of the pilot signal components.
Thus, by changing the pilot signal d(k) to correlate differently with U(k),
the quiescent weight vector $W_q$ can be altered.

The method proposed here for modifying d(k) is to allow d(k) to
include delayed samples of the $n_i(k)$. This can be accomplished by passing
the $n_i(k)$ through a set of transversal filters, where each weight has as a
value the desired correlation of d(k) with the corresponding element in
U(k).


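As an example, let us take a case of two sensors, each with a two-tap
delay line.

If the pilot signal is formed as originally described, we have

$$ d(k) = n_1(k) + n_2(k), \qquad
   U(k) = [\,n_1(k)\ \ n_1(k-1)\ \ n_2(k)\ \ n_2(k-1)\,]^T $$

If instead the pilot signal is formed as
$d(k) = a\,n_1(k) + b\,n_1(k-1) + c\,n_2(k) + d\,n_2(k-1)$, then

$$ P = E\{d(k)U(k)\} = E\begin{bmatrix}
a\,n_1^2(k) + b\,n_1(k-1)n_1(k) + c\,n_2(k)n_1(k) + d\,n_2(k-1)n_1(k) \\
a\,n_1(k)n_1(k-1) + b\,n_1^2(k-1) + c\,n_2(k)n_1(k-1) + d\,n_2(k-1)n_1(k-1) \\
a\,n_1(k)n_2(k) + b\,n_1(k-1)n_2(k) + c\,n_2^2(k) + d\,n_2(k-1)n_2(k) \\
a\,n_1(k)n_2(k-1) + b\,n_1(k-1)n_2(k-1) + c\,n_2(k)n_2(k-1) + d\,n_2^2(k-1)
\end{bmatrix}
= \sigma^2 \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} $$

since the cross terms have zero mean, so

$$ W_q = [\,a\ \ b\ \ c\ \ d\,]^T $$

as desired.

Thus this method of formation of the pilot signal allows control over the
quiescent pattern of the system by choice of a suitable quiescent weight
vector.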
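
The two-sensor example can be checked with a short simulation. The sketch below is an illustration under assumed parameter values, not the report's code: the function name `quiescent_weights`, the step size, and the run length are hypothetical choices. It forms the pilot signal by passing each white noise source through its own transversal filter and lets the LMS rule adapt with no external signals present.

```python
import random

def quiescent_weights(coeffs, mu=0.01, steps=5000, sigma=1.0, seed=1):
    """LMS weights reached with a shaped pilot signal and no external signals.

    coeffs[i][t] is the desired quiescent weight for sensor i, tap t.
    """
    rng = random.Random(seed)
    n_sensors, taps = len(coeffs), len(coeffs[0])
    lines = [[0.0] * taps for _ in range(n_sensors)]   # delay-line contents U(k)
    W = [[0.0] * taps for _ in range(n_sensors)]
    for _ in range(steps):
        # shift a fresh white-noise sample into each delay line
        lines = [[rng.gauss(0.0, sigma)] + line[:-1] for line in lines]
        # shaped pilot: each noise source filtered by its own transversal filter
        d = sum(c * n for ci, li in zip(coeffs, lines) for c, n in zip(ci, li))
        y = sum(w * n for wi, li in zip(W, lines) for w, n in zip(wi, li))
        e = d - y
        # LMS adaptation: W(k+1) = W(k) + 2*mu*e(k)*U(k)
        W = [[w + 2 * mu * e * n for w, n in zip(wi, li)]
             for wi, li in zip(W, lines)]
    return W
```

Calling `quiescent_weights([[0.5, -0.25], [1.0, 0.75]])` should return weights close to the chosen coefficients, which play the roles of a, b, c and d in the example.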

7. A Proposal for Modification of the ABWIN Algorithm

In view of the preceding analysis of the ABWIN, particularly of the
advantages of choosing a quiescent weight vector, and many similarities to
the ALEWIN described in part A of this report, a modification to the ABWIN
is proposed here.

In part A of this report, after the ALEWIN is introduced, a second
type of line enhancer, the ALEWAIN, is introduced. The similarities in
performance between the ALEWIN and the ALEWAIN are demonstrated. The
ALEWAIN (using the "leaky" LMS algorithm) has the characteristic that in the
absence of any excitation (inputs), the weight vector collapses, or "relaxes",
to zero. In the ABWIN, we see similar behavior in that in the absence of
external excitation, the weight vector returns to its quiescent value.

On the basis of this resemblance, the following algorithm is proposed:
run the adaptive array as discussed before, but without the pilot signal noise
components added to the sensor outputs. As the error signal, use the
negative of the system output. Then use the following rule for updating
the weight vector:

$$ W(k+1) = W(k) + 2\mu\,e(k)\,U(k) + 2\mu\gamma\,(W_q - W(k)) $$

where $\gamma$ is a constant to be adjusted and $W_q$ is the quiescent weight
vector.

Now a term explicitly causing a relaxation effect is included, the constant
$\gamma$ controlling the magnitude of the relaxation effect.

Preliminary studies indicate that this algorithm has the desired
features of the ABWIN, but in addition does not require a "parallel"
system for computation of the system output y without the corrupting pilot
signal, does not require a complicated scheme for generating the desired
quiescent pattern, and generates less adaptation noise in the weights
(thus enabling faster convergence).
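
A minimal sketch of the proposed update rule follows. The sign of the relaxation term is taken so that the weights are pulled toward the quiescent vector; that convention, the function names (`relaxed_lms_step`, `relax`), and the parameter values are assumptions made for illustration.

```python
def relaxed_lms_step(W, U, W_q, mu, gamma):
    """One update of the proposed rule; returns (new weights, output y)."""
    y = sum(w * u for w, u in zip(W, U))   # system output, no pilot noise in U
    e = -y                                 # error is the negative of the output
    W_next = [w + 2 * mu * e * u + 2 * mu * gamma * (wq - w)
              for w, u, wq in zip(W, U, W_q)]
    return W_next, y

def relax(W, W_q, mu, gamma, steps):
    """Run the update with zero input to show the relaxation toward W_q."""
    zero = [0.0] * len(W)
    for _ in range(steps):
        W, _ = relaxed_lms_step(W, zero, W_q, mu, gamma)
    return W
```

With zero input each weight error shrinks by the factor (1 - 2μγ) per step, so the weights return to the quiescent vector. Under this sign convention and the same small-μ approximation used in section 4, setting the mean update to zero gives $(R_{ss} + \gamma I)W = \gamma W_q$, which has the same form as the ABWIN result with γ in place of σ².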

8.  Conclusions 

The  Adaptive  Beamformer  With  Injected  Noise  has  been  introduced, 
and  some  analysis  has  been  undertaken.  Simulations  of  the  ABWIN  have  been 
shown  to  agree  with  the  theoretical  results.  The  formation  of  the  pilot 
signal  to  obtain  a  desired  quiescent  response  has  been  discussed,  and  a  new 
method  to  accomplish  the  same  goals  as  the  ABWIN  with  a  simpler  algorithm 
has  been  proposed  for  study. 

The  effect  of  sensor  geometry  on  the  capabilities  of  the  ABWIN  has 
not  yielded  to  analysis  at  this  time,  and  is  likely  to  be  a  problem  in 
future  studies  of  all  antenna  arrays,  adaptive  and  otherwise. 

It is suggested that the study of the new algorithm proposed above
be pursued, with the intent of comparing its performance to that of the
ABWIN, and extending analysis of both algorithms further than presented
herein. In addition, the study of the effect of sensor geometry on the
capabilities of these systems should be continued as a background activity.

I. Introduction

THE APPLICATION of adaptive techniques has
allowed development during the past fifteen years of
high-performance receiving antennas with a capability of
automatically eliminating sidelobe interference. In such
antennas the main beam is steered in a predetermined
direction in search of expected signals, while interference
received outside the main beam causes the formation of
nulls in the radiation pattern [1]-[10]. New types of
adaptive antennas are also currently being designed that
will automatically seek and track desired signals. This
application promises a further significant enhancement of
antenna capabilities.

Many adaptive antenna systems are configured by connecting
the elements of an antenna array to a multichannel
adaptive  filter.  In  its  general  form  an  adaptive  filter  is  a 
device  that  adjusts  its  internal  parameters  and  optimizes 
its  performance  according  to  the  statistical  characteristics 
of  its  input  and  output  signals.  The  internal  filter  adjustment 
is  made  through  a  series  of  variable  settings  controlled  by  an 
adaptive  algorithm. 

The purpose of this paper is to analyze and compare the
properties  of  certain  algorithms  available  for  use  with 
adaptive  filters.  Two  basic  methods  of  adaptation  are 
considered,  those  of  steepest  descent  and  random  search. 
Theoretical  performance  comparisons  of  algorithms  based 
on  these  methods,  including  the  Widrow-Hoff  LMS 
algorithm  and  a  new  linear  random  search  algorithm,  are 
made  by  relating  quality  of  solution  to  speed  of  adaptation. 
Results  of  computer  simulations  are  presented  to  provide 
experimental  confirmation  of  the  theoretically  predicted 
performance  of  the  algorithms  and  to  illustrate  their  use  in 
adaptive  antenna  applications. 


Fig. 1. Adaptive filter consisting of tapped delay line connected
to adaptive linear combiner. (a) Adaptive filter configuration. (b)
Adaptive linear combiner with input and output terminology.


II. Characteristics and Terminology of the Adaptive Process

The theoretical analyses of this paper are based on the
particular form of adaptive transversal filter illustrated in
Fig. 1. This finite impulse response (FIR) filter consists of
a tapped delay line connected to an adaptive linear combiner
that adjusts the gain of (or “weights”) the signals derived
from the delay line and combines them to form an output
signal. All of the algorithms described in this paper can
be used to govern the operation of the adaptive linear
combiner; the LMS algorithm is restricted to this use.

The input signal vector $X_j$ of the adaptive linear combiner
is defined as

$$X_j \triangleq [x_{1j} \;\; x_{2j} \;\; \cdots \;\; x_{nj}]^T. \qquad (1)$$

The input signal components are assumed to appear
simultaneously on all input lines at discrete times indexed
by the subscript $j$. The weighting coefficients or multiplying
factors are adjustable, as symbolized in Fig. 1 by circles
with arrows through them. The weight vector $W$ is

$$W \triangleq [w_1 \;\; w_2 \;\; \cdots \;\; w_n]^T. \qquad (2)$$


The output $y_j$ is equal to the inner product of $X_j$ and $W$:

$$y_j = X_j^T W = W^T X_j. \qquad (3)$$

The error $\epsilon_j$ is defined as the difference between the desired
response $d_j$ (an externally supplied input sometimes called
the “training signal”) and the actual response $y_j$:

$$\epsilon_j \triangleq d_j - X_j^T W = d_j - W^T X_j. \qquad (4)$$
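
As a minimal sketch of (3) and (4), the combiner output and error for one input vector can be computed as follows. The function names and the numerical values are illustrative assumptions and do not come from the paper.

```python
def combiner_output(x, w):
    """Return y_j = X_j^T W, the inner product of eq. (3)."""
    if len(x) != len(w):
        raise ValueError("input and weight vectors must have equal length")
    return sum(xi * wi for xi, wi in zip(x, w))


def combiner_error(d, x, w):
    """Return eps_j = d_j - X_j^T W, the error of eq. (4)."""
    return d - combiner_output(x, w)


# Illustrative values only (assumed for this sketch).
x_j = [1.0, -2.0, 0.5]   # input signal vector X_j
w = [0.5, 0.25, 2.0]     # weight vector W
d_j = 1.5                # desired response d_j

y_j = combiner_output(x_j, w)
eps_j = combiner_error(d_j, x_j, w)
```

With these assumed numbers the output is 1.0 and the error is 0.5.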

In adaptive antenna systems the desired response may be
derived by various methods, one of which is to inject a
“pilot signal” whose characteristics determine the “look”
direction and frequency response of the main beam [4].
Other methods are illustrated in Section VI.

It is the purpose of the adaptive process to adjust the
weights of the adaptive linear combiner to minimize the
mean square of the error $\epsilon_j$. Let the input signals $X_j$ and
desired response $d_j$ be statistically stationary. During
adaptation the weight vector varies, so that even with
stationary inputs the output $y_j$ and error $\epsilon_j$ will generally
be nonstationary. Care must thus be taken in defining the
mean square error for an adaptive system. The only possibility
is an ensemble average, which can be established in
the following manner.

The adaptive process progresses recursively or by iterative
cycles. At the kth iteration let the weight vector be $W_k$.
Squaring and expanding (4) and letting $W = W_k$ yields

$$\epsilon_j^2 = d_j^2 - 2d_j X_j^T W_k + W_k^T X_j X_j^T W_k. \qquad (5)$$

Now assume an ensemble of identical adaptive linear
combiners, each having the same weight vector $W_k$ at the
kth iteration. Let each combiner have individual inputs
$X_j$ and $d_j$ derived, respectively, from stationary ergodic
ensembles. Each combiner will produce an individual error
$\epsilon_j$ represented by (5). Averaging (5) over the ensemble
yields

$$E[\epsilon_j^2]_{W=W_k} = E[d_j^2] - 2E[d_j X_j^T]W_k + W_k^T E[X_j X_j^T]W_k. \qquad (6)$$

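Defining the vector $P$ as the cross correlation between the
desired response (a scalar) and the $X$-vector then yields

$$P \triangleq E[d_j X_j] = E[d_j x_{1j} \;\; d_j x_{2j} \;\; \cdots \;\; d_j x_{nj}]^T. \qquad (7)$$

The input correlation matrix $R$ is defined in terms of the
ensemble average

$$R \triangleq E[X_j X_j^T] = E\begin{bmatrix} x_{1j}x_{1j} & x_{1j}x_{2j} & \cdots & x_{1j}x_{nj} \\ x_{2j}x_{1j} & x_{2j}x_{2j} & \cdots & x_{2j}x_{nj} \\ \vdots & \vdots & \ddots & \vdots \\ x_{nj}x_{1j} & x_{nj}x_{2j} & \cdots & x_{nj}x_{nj} \end{bmatrix}. \qquad (8)$$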


This matrix is real, symmetric, and positive definite, or in
rare cases positive semidefinite. The mean square error
can thus be expressed as

$$\xi_k \triangleq E[\epsilon_j^2]_{W=W_k} = E[d_j^2] - 2P^T W_k + W_k^T R W_k. \qquad (9)$$

Note that the mean square error is a quadratic function of
the weights that can be pictured as a concave hyperparaboloidal
surface, a function that never goes negative.
Adjusting the weights involves descending along this surface
with the objective of reaching its unique minimum point
(“the bottom of the bowl” [11]). Gradient methods are
commonly used for this purpose.

The gradient $\nabla_k$ of the mean square error function with
$W = W_k$ is obtained by differentiating (9):

$$\nabla_k \triangleq \left[ \frac{\partial E[\epsilon_j^2]}{\partial w_1} \;\; \cdots \;\; \frac{\partial E[\epsilon_j^2]}{\partial w_n} \right]^T_{W=W_k} = -2P + 2RW_k. \qquad (10)$$

The optimal weight vector $W^*$, generally called the Wiener
weight vector, is obtained by setting the gradient to zero:

$$W^* = R^{-1}P. \qquad (11)$$

This equation is a matrix form of the Wiener-Hopf equation
[12]-[14].

For the purposes of subsequent analysis it is convenient
to reexpress the mean square error function (9) and the
gradient function (10) in more compact form. Substituting
(11) in (9) yields the minimum mean square error:

$$\xi_{\min} = E[d_j^2] - P^T W^*. \qquad (12)$$

Recombining (12) with (9) and (11) yields

$$\xi_k = \xi_{\min} + V_k^T R V_k \qquad (13)$$

where

$$V_k \triangleq W_k - W^*. \qquad (14)$$

The gradient may be expressed in terms of $V_k$ as

$$\nabla_k = 2RV_k. \qquad (15)$$

If one assumes that the $R$-matrix is positive definite, it
may be expressed in normal form as follows:

$$R = Q\Lambda Q^{-1} \qquad (16)$$

where the columns of the square modal matrix $Q$ are the
eigenvectors of $R$ and $\Lambda$ is the diagonal matrix of eigenvalues.
If $Q$ is constructed to be orthonormal,² then one
may write

$$Q^{-1} = Q^T. \qquad (17)$$

Note further that the inverse of $R$ is

$$R^{-1} = Q\Lambda^{-1}Q^{-1}. \qquad (18)$$

The mean square error may thus be expressed as

$$\xi_k = \xi_{\min} + V_k^T Q\Lambda Q^{-1} V_k. \qquad (19)$$

A new set of coordinates may now be defined as follows:

$$V' \triangleq Q^T V = Q^{-1}V \qquad (20)$$

and

$$V'^T = V^T Q. \qquad (21)$$

Substituting (20) and (21) into (19) then yields

$$\xi_k = \xi_{\min} + V_k'^T \Lambda V_k'. \qquad (22)$$

The transformation $Q^{-1}$ projects $V$ into $V'$; that is, projects
$V$ into primed coordinates. It can be observed from (22)
that, since $\Lambda$ is diagonal, the primed coordinates must
comprise the principal axes of the quadratic mean square
error performance surface. The gradient expressed in
primed coordinates then becomes

$$\nabla_k' = 2\Lambda V_k'. \qquad (23)$$
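
The relations (9) through (15) can be checked numerically on a small case. The sketch below assumes a hypothetical two-weight problem; the values of R, P, and E[d²] are invented for illustration, and the helper names are not from the paper.

```python
# Hypothetical two-weight statistics (assumed numbers, chosen so that
# the minimum mean square error comes out positive).
R = [[2.0, 1.0], [1.0, 2.0]]   # input correlation matrix, eq. (8)
P = [7.0, 8.0]                 # cross-correlation vector, eq. (7)
E_D2 = 42.0                    # mean square of the desired response


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def solve_2x2(m, b):
    """Solve m w = b for a 2 x 2 matrix by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - m[1][0] * b[0]) / det]


def mse(w):
    """Mean square error as the quadratic form of eq. (9)."""
    return E_D2 - 2.0 * dot(P, w) + dot(w, mat_vec(R, w))


def gradient(w):
    """Gradient of the mean square error, eq. (10)."""
    return [-2.0 * p + 2.0 * r for p, r in zip(P, mat_vec(R, w))]


W_STAR = solve_2x2(R, P)          # Wiener weight vector, eq. (11)
XI_MIN = E_D2 - dot(P, W_STAR)    # minimum mean square error, eq. (12)


def mse_compact(w):
    """Mean square error in the compact form of eqs. (13) and (14)."""
    v = [wi - ws for wi, ws in zip(w, W_STAR)]
    return XI_MIN + dot(v, mat_vec(R, v))
```

For these assumed statistics the Wiener weights are [2, 3] and the minimum mean square error is 4; `mse` and `mse_compact` agree at any weight vector, and `gradient(W_STAR)` is zero.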

III.  The  Method  of  Steepest  Descent 

The practical objective of the adaptive process is to find
a solution to (11). One way of doing so would be by analytical
means. An analytical solution, however, would present
serious computational difficulties when the number of
weights was large or when the input data rate was high.
In addition to the inversion of an n x n matrix, it could
require as many as n(n + 3)/2 autocorrelation and cross
correlation measurements to obtain the elements of $R$
and $P$. Furthermore, this process would have to be continually
repeated in most circumstances, where the input
signal statistics would be slowly varying. For these reasons
it is more practicable to make use of other recursive statistical
estimation methods in devising algorithms for use in
adaptive filters.

² This can always be done when $R$ is positive definite.

A  well  known  and  proven  method  for  adjusting  the 
response  of  an  adaptive  system  is  that  of  steepest  descent 
[15], [16]. Adaptation by this method starts with an
arbitrary initial value $W_0$ for the weight vector. The
gradient  of  the  mean  square  error  function  is  measured 
and  the  weight  vector  altered  in  accordance  with  the 
negative  of  the  value  obtained.  This  procedure  is  repeated, 
causing  the  error  to  be  successively  reduced  and  the  weight 
vector  to  approach  the  optimal  value. 

The  method  of  steepest  descent  can  be  described  by  the 
relation 

$$W_{k+1} = W_k + \mu(-\nabla_k) \qquad (24)$$

where $\mu$ is a parameter that controls stability and rate of
convergence, and $\nabla_k$ is the value of the gradient at a point
on the error surface corresponding to $W = W_k$. An
expression for the gradient, a linear function of the weights,
is given by (15). Substituting this expression into (24) yields

$$W_{k+1} = W_k - 2\mu RV_k. \qquad (25)$$

Subtracting $W^*$ from both sides of (25) yields

$$V_{k+1} = V_k - 2\mu RV_k = (I - 2\mu R)V_k. \qquad (26)$$

Equation (26) is a linear homogeneous vector difference
equation whose solution characterizes the dynamic behavior
of the weight vector as it begins at $W_0$ and, if the
process is convergent, relaxes toward $W^*$. The solution of
(26) is given by

$$V_k = (I - 2\mu R)^k V_0. \qquad (27)$$

This solution is stable (convergent) if

$$\lim_{k \to \infty} (I - 2\mu R)^k = 0. \qquad (28)$$

Since

$$(I - 2\mu R) = Q(I - 2\mu\Lambda)Q^{-1} \qquad (29)$$

and

$$(I - 2\mu R)^k = Q(I - 2\mu\Lambda)^k Q^{-1}, \qquad (30)$$

condition (28) will be satisfied if

$$\lim_{k \to \infty} (I - 2\mu\Lambda)^k = 0. \qquad (31)$$

Condition (31) will be met when

$$|1 - 2\mu\lambda_p| < 1 \qquad (32)$$

for p = 1, 2, ..., n. Since all eigenvalues are positive,

$$\frac{1}{\lambda_{\max}} > \mu > 0 \qquad (33)$$

where $\lambda_{\max}$ is the largest eigenvalue of $R$. Equation (33)
gives the stable range for $\mu$.

It is easily shown that in primed coordinates the method
of steepest descent is represented by

$$V_{k+1}' = (I - 2\mu\Lambda)V_k' \qquad (34)$$

whose solution is

$$V_k' = (I - 2\mu\Lambda)^k V_0'. \qquad (35)$$

For the pth coordinate one may write

$$v_{pk}' = (1 - 2\mu\lambda_p)^k v_{p0}'. \qquad (36)$$

Equation (36) represents a simple geometric progression for
$v_{pk}'$, starting from the initial condition $v_{p0}'$. The pth
geometric ratio is

$$r_p = (1 - 2\mu\lambda_p). \qquad (37)$$

An exponential envelope of time constant $\tau_p$ can be
fitted to the geometric sequence represented by (36). If the
unit of time is one iteration cycle, then

$$r_p = \exp(-1/\tau_p) = 1 - \frac{1}{\tau_p} + \frac{1}{2!\,\tau_p^2} - \cdots. \qquad (38)$$

In practical adaptive processes $\mu$ is chosen so that $\tau_p$ is
large compared to one; the series of (38) can thus be
represented by its first two terms. Combining (38) with (37)
gives a formula for the pth time constant of the method of
steepest descent:

$$\tau_p = \frac{1}{2\mu\lambda_p}. \qquad (39)$$

Transient phenomena in the weights, as seen from (35)
and (36), are simple geometric sequences along the primed
coordinates. Along the original unprimed coordinates, the
same phenomena, represented by (27), are more complicated.
Transients in the weights themselves thus consist of sums
of geometric sequences, the number of time constants
typically being equal to the number of weights.

While transients are occurring in the weights as they
relax toward the optimal Wiener solution, the mean square
error undergoes changes. The expected error, for $W = W_k$,
is given by (22). The weight transients, expressed in terms
of $V_k'$, are given by (35). A “learning curve” showing mean
square error as a function of number of iterations k can be
computed by substituting (35) into (22):

$$\xi_k = \xi_{\min} + V_0'^T (I - 2\mu\Lambda)^{2k} \Lambda V_0'. \qquad (40)$$

As long as conditions (31) and (33) are met, the adaptive
process will converge on the minimum point of the mean
square error surface:

$$\lim_{k \to \infty} \xi_k = \xi_{\min}. \qquad (41)$$

The mean square error solution starts at k = 0 with an
initial value $\xi_{\min} + V_0'^T \Lambda V_0'$, corresponding to $V_k' = V_0'$,
and relaxes toward $\xi_{\min}$. The relaxation process is a sum of
geometric sequences whose pth mode has a geometric ratio
of $(1 - 2\mu\lambda_p)^2$. Thus the mean square error learning curve
has a pth mode time constant of

$$\tau_{p\,mse} = \frac{\tau_p}{2} = \frac{1}{4\mu\lambda_p}. \qquad (42)$$

Learning curves of computer simulated adaptive processes
will be presented below.

If exact gradient measurements could be made each
iteration, the adaptive weight vector would converge to the
Wiener optimal weight vector. In reality, however, exact
gradient measurements are not possible, and the gradient
vector must be estimated from a limited statistical sample.
The following sections describe two algorithms based on the
method of steepest descent that use different techniques
to obtain the necessary gradient estimates. The first uses
differentiation and requires that finite perturbations be
made in the weight vector. The second, the LMS algorithm,
obtains gradient estimates directly and without perturbing
or “dithering” the nominal weight vector adjustment.
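
The ideal (exact-gradient) recursion in primed coordinates is easy to sketch. The example below iterates the steepest descent recursion mode by mode, evaluates the learning curve, and computes the two sets of time constants. The eigenvalues, adaptation constant, minimum error, and starting point are assumed values chosen only for illustration.

```python
LAMBDAS = [0.5, 2.0]       # assumed eigenvalues of R
MU = 0.05                  # assumed adaptation constant
XI_MIN = 1.0               # assumed minimum mean square error
V0_PRIMED = [4.0, -3.0]    # assumed initial weight error, primed coordinates


def is_stable(mu, lambdas):
    """Stable range for mu: 1/lambda_max > mu > 0."""
    return 0.0 < mu < 1.0 / max(lambdas)


def descend(v0, lambdas, mu, steps):
    """Iterate V'_{k+1} = (I - 2 mu Lambda) V'_k and return every V'_k."""
    history = [list(v0)]
    for _ in range(steps):
        history.append([(1.0 - 2.0 * mu * lam) * v
                        for lam, v in zip(lambdas, history[-1])])
    return history


def mse(v_primed, lambdas, xi_min):
    """Mean square error xi_min + V'^T Lambda V' in primed coordinates."""
    return xi_min + sum(lam * v * v for lam, v in zip(lambdas, v_primed))


def time_constants(mu, lambdas):
    """Return (tau_p, tau_p_mse) = (1/(2 mu lambda_p), 1/(4 mu lambda_p))."""
    tau = [1.0 / (2.0 * mu * lam) for lam in lambdas]
    tau_mse = [t / 2.0 for t in tau]
    return tau, tau_mse


history = descend(V0_PRIMED, LAMBDAS, MU, 200)
learning_curve = [mse(v, LAMBDAS, XI_MIN) for v in history]
tau, tau_mse = time_constants(MU, LAMBDAS)
```

With these assumed numbers the two modes decay with geometric ratios 0.95 and 0.8, so the mode with the small eigenvalue sets the overall convergence time.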


A.  Differential  Algorithm 


One way of estimating gradient vectors is by the direct
measurement of derivatives. Although this technique is
straightforward and easy to implement, it has been largely
overlooked in the literature and is here analyzed in detail.
For convenience the resulting algorithm is designated the
DSD (“differential steepest descent”) algorithm.

1) Gradient estimation by derivative measurement: A
single component of the gradient vector can be measured
in the manner illustrated in Fig. 2. The curve representing
the parabolic mean square error function of a single
variable is defined by

$$\xi(v_k) \triangleq E[\epsilon_j^2]_{v=v_k} = \xi_{\min} + \lambda v_k^2. \qquad (43)$$

Its first and second derivatives are

$$\left. \frac{d\xi}{dv} \right|_{v=v_k} = 2\lambda v_k \qquad (44)$$

$$\left. \frac{d^2\xi}{dv^2} \right|_{v=v_k} = 2\lambda. \qquad (45)$$

The derivatives are numerically estimated by taking
“symmetric differences”:

$$\frac{d\xi}{dv} = \frac{\xi(v_k + \delta) - \xi(v_k - \delta)}{2\delta}, \qquad \frac{d^2\xi}{dv^2} = \frac{\xi(v_k + \delta) - 2\xi(v_k) + \xi(v_k - \delta)}{\delta^2}. \qquad (46)$$

These finite differences are exact for the quadratic $\xi$-function.
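
A short numerical check illustrates why symmetric differences are exact on a parabola. The sketch assumes a one-dimensional error function of the form used above, with made-up values for the minimum error, the eigenvalue, and the step size.

```python
XI_MIN = 0.5   # assumed minimum mean square error
LAM = 3.0      # assumed eigenvalue (curvature parameter)


def xi(v):
    """One-dimensional quadratic error function xi_min + lambda v^2."""
    return XI_MIN + LAM * v * v


def first_difference(f, v, delta):
    """Symmetric-difference estimate of the first derivative."""
    return (f(v + delta) - f(v - delta)) / (2.0 * delta)


def second_difference(f, v, delta):
    """Symmetric-difference estimate of the second derivative."""
    return (f(v + delta) - 2.0 * f(v) + f(v - delta)) / (delta * delta)


v_k = 1.5
delta = 0.25
slope = first_difference(xi, v_k, delta)       # analytic value 2 * LAM * v_k
curvature = second_difference(xi, v_k, delta)  # analytic value 2 * LAM
```

The estimates match the analytic derivatives for any step size, up to floating-point rounding, because the error function has no terms beyond second order.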

The procedure illustrated in Fig. 2 requires that the weight
adjustment be altered while the gradient measurement is
being made. It is assumed that no time is spent at the
nominal adjustment $v_k$ but that equal time³ is spent at
$v_k + \delta$ and $v_k - \delta$. The result is that on the average the
mean square error is greater by an amount $\gamma$ than it would
have been if the adjustment had remained at $v_k$. A performance
penalty thus results from the weight vector
alteration.

The quantity $\gamma$ can be calculated for the one-dimensional
quadratic $\xi$-function as follows:

$$\gamma \triangleq \frac{[\xi_{\min} + \lambda(v_k + \delta)^2] + [\xi_{\min} + \lambda(v_k - \delta)^2]}{2} - (\xi_{\min} + \lambda v_k^2) \qquad (47)$$

$$= \lambda\delta^2. \qquad (48)$$

Notice that the value of $\gamma$ depends only on $\lambda$ and $\delta$ and
not on $v_k$. A dimensionless measure of how much the
adaptive system is perturbed each time the gradient is
measured, a parameter that may be called the “perturbation”
$P$, is defined as follows:

$$P \triangleq \frac{\gamma}{\xi_{\min}} = \frac{\lambda\delta^2}{\xi_{\min}}. \qquad (49)$$

This  is  the  average  increase  in  mean  square  error  normalized 
with  respect  to  the  minimum  achievable  mean  square  error. 
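
The independence of the excess error from the operating point can be confirmed with a few lines of code. This sketch reuses an assumed one-dimensional quadratic error function; all numerical values are hypothetical.

```python
XI_MIN = 0.5   # assumed minimum mean square error
LAM = 3.0      # assumed eigenvalue
DELTA = 0.25   # assumed perturbation step


def xi(v):
    """One-dimensional quadratic error function xi_min + lambda v^2."""
    return XI_MIN + LAM * v * v


def average_excess_mse(v, delta):
    """Average error at v + delta and v - delta, minus the error at v."""
    return 0.5 * (xi(v + delta) + xi(v - delta)) - xi(v)


def perturbation(lam, delta, xi_min):
    """Dimensionless perturbation P = lambda delta^2 / xi_min."""
    return lam * delta * delta / xi_min


gamma_at_origin = average_excess_mse(0.0, DELTA)
gamma_far_away = average_excess_mse(10.0, DELTA)
p = perturbation(LAM, DELTA, XI_MIN)
```

Both evaluations of the excess error give 0.1875, and the perturbation for these assumed values is 0.375.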

The estimation of two-dimensional gradients may now be
considered. In this case the $R$-matrix is given by

$$R = \begin{bmatrix} r_{00} & r_{01} \\ r_{10} & r_{11} \end{bmatrix} \qquad (50)$$

and the $\xi$-function is

$$\xi = r_{00}v_1^2 + r_{11}v_2^2 + 2r_{01}v_1v_2 + \xi_{\min}. \qquad (51)$$

When the partial derivative of the error surface along
coordinate $v_1$ is measured, the perturbation is

$$P = \frac{r_{00}\delta^2}{\xi_{\min}}. \qquad (52)$$

The perturbation for measurement along coordinate $v_2$ is

$$P = \frac{r_{11}\delta^2}{\xi_{\min}}. \qquad (53)$$

Assuming that equal time is required for the measurement
of each gradient component (that is, that 2N data samples
are used for each measurement), the average perturbation
during the measurement is given by

$$P = \frac{\delta^2}{\xi_{\min}} \cdot \frac{r_{00} + r_{11}}{2}. \qquad (54)$$

If one now defines a general perturbation for n dimensions
as the average of the perturbations of the individual
gradient component measurements, one obtains

$$P = \frac{\delta^2}{\xi_{\min}} \cdot \frac{\operatorname{tr} R}{n}. \qquad (55)$$


Since the trace of the $R$-matrix is equal to the sum of its
eigenvalues, and since the sum divided by the number of
eigenvalues is equal to the average of the eigenvalues, the
perturbation may be conveniently expressed as

$$P = \frac{\delta^2 \lambda_{av}}{\xi_{\min}}. \qquad (56)$$

³ The time required to take N data samples.

Other  means  of  gradient  measurement  have  been  used  in 
practical  systems.  A  weight  can  be  perturbed  or  dithered 
sinusoidally,  and  the  cross  correlation  between  the  weight 
value  and  the  value  of  the  performance  function  determined. 
All  weights  can  be  simultaneously  dithered  at  individual 
frequencies  and  the  gradient  components  obtained  by  cross 
correlation. The procedure of Fig. 2 corresponds to square-wave
dithering.

2) Gradient measurement noise: Gradients measured in
the manner shown in Fig. 2 are noisy because they are
based on differences in $\xi$-measurements that are noisy.
Each $\xi$-measurement is an estimate $\hat{\xi}$ based on N error
samples:

$$\hat{\xi} = \frac{1}{N}\sum_{j=1}^{N} \epsilon_j^2. \qquad (57)$$

It is well known that the variance in an estimate of the mean
square obtained from N independent samples is equal to
the difference, divided by N, between the mean fourth and
the square of the mean square. The variance in the estimate
of $\xi$ may accordingly be expressed as

$$\operatorname{var}[\hat{\xi}] = \frac{E[\epsilon_j^4] - (E[\epsilon_j^2])^2}{N}. \qquad (58)$$

If $\epsilon_j$ is normally distributed with zero mean and variance
$\sigma^2$, its mean fourth is $3\sigma^4$, and the square of its mean
square is $\sigma^4$. The variance in the estimate of $\xi$ is thus

$$\operatorname{var}[\hat{\xi}] = \frac{1}{N}(3\sigma^4 - \sigma^4) = \frac{2\sigma^4}{N} = \frac{2\xi^2}{N}. \qquad (59)$$

Note that the variance is proportional to the square of $\xi$
and inversely proportional to the number of data samples.
It can thus in general be expressed as

$$\operatorname{var}[\hat{\xi}] = \frac{K\xi^2}{N} \qquad (60)$$

where K has a value of 2 for an unbiased Gaussian probability
density. If the probability density is other than Gaussian, the
value of K is generally less than but close to two. It is thus
assumed for the purposes of subsequent analysis that

$$\operatorname{var}[\hat{\xi}] = \frac{2\xi^2}{N}. \qquad (61)$$

The derivatives required by the gradient estimation
technique of Fig. 2 are measured in accordance with (46).
The error in the derivative estimate will be a sum of two
components that, since the samples of the error $\epsilon_j$ are
assumed to be independent, will also be independent. The
variance of each component is determined by (61). If it is
assumed that the perturbation $P$ is small, that the adaptive
process is close to convergence, and that the weight vector
remains near the minimum point of the mean square error
surface, then the two components will have essentially
the same variances, and these variances will be additive.
The variance in the estimate of the derivative, using (46)
and (61), may be expressed as

$$\operatorname{var}\left[\frac{d\hat{\xi}}{dv}\right] = \frac{1}{4\delta^2}\left[\frac{2\xi^2(v_k + \delta)}{N} + \frac{2\xi^2(v_k - \delta)}{N}\right] \approx \frac{\xi_{\min}^2}{N\delta^2}. \qquad (62)$$

When a gradient vector is measured, the errors in each
component are independent. The gradient noise vector $N_k$
may thus be defined in terms of the true gradient $\nabla_k$ and
the estimated gradient $\hat{\nabla}_k$:

$$\hat{\nabla}_k \triangleq \nabla_k + N_k. \qquad (63)$$

Under the assumed conditions the covariance of the gradient
noise vector is thus given by

$$\operatorname{cov}[N_k] = \frac{\xi_{\min}^2}{N\delta^2} I. \qquad (64)$$

It is also useful to obtain an expression for the covariance
of the gradient noise vector in primed coordinates:

$$N_k' = Q^{-1}N_k. \qquad (65)$$

Since the covariance matrix of $N_k$ is scalar, projecting into
primed coordinates through the orthonormal transformation
yields the same covariance for $N_k'$:

$$\operatorname{cov}[N_k'] = E[Q^{-1}N_kN_k^TQ] = \frac{\xi_{\min}^2}{N\delta^2} I. \qquad (66)$$

Near the minimum point of the mean square error surface
the covariance of the gradient noise is essentially constant
and not a function of $W_k$.
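
The variance formula for the mean square error estimate can be checked by simulation. The following Monte Carlo sketch assumes zero-mean Gaussian error samples; the standard deviation, sample count, trial count, and random seed are arbitrary choices, and the comparison is only statistical.

```python
import random


def mse_estimate(samples):
    """Estimate of xi: the average of N squared error samples."""
    return sum(e * e for e in samples) / len(samples)


def variance_of_estimate(sigma, n_samples, trials, rng):
    """Sample variance of the mean square error estimate over many trials."""
    estimates = [
        mse_estimate([rng.gauss(0.0, sigma) for _ in range(n_samples)])
        for _ in range(trials)
    ]
    mean = sum(estimates) / trials
    return sum((x - mean) ** 2 for x in estimates) / (trials - 1)


SIGMA = 1.5      # assumed standard deviation of the error
N = 50           # assumed number of error samples per estimate
TRIALS = 4000    # assumed number of Monte Carlo trials

rng = random.Random(1234)
xi_true = SIGMA ** 2
predicted = 2.0 * xi_true ** 2 / N        # Gaussian case, K = 2
measured = variance_of_estimate(SIGMA, N, TRIALS, rng)
```

The measured variance should fall within a few percent of the predicted value; the agreement tightens as the number of trials grows.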

3) Noise in the weight vector: Adaptation based on noisy
gradient estimates results in noise in the weight vector.
The method of steepest descent with ideal gradients is
represented by (26). With estimated gradients this equation
may be rewritten as

$$V_{k+1} = V_k + \mu(-\hat{\nabla}_k) = V_k + \mu(-\nabla_k - N_k). \qquad (67)$$

Substituting (15) and combining terms yields

$$V_{k+1} = (I - 2\mu R)V_k - \mu N_k, \qquad (68)$$

a first-order vector difference equation with a stochastic
driving function of $-\mu N_k$. Projection into primed coordinates
may be accomplished by premultiplying both
sides of (68) by $Q^{-1}$:

$$V_{k+1}' = (I - 2\mu\Lambda)V_k' - \mu N_k'. \qquad (69)$$

In steady state, after initial adaptive transients have died
out, $V_k'$ undergoes a stationary random process in response
to the stationary driving function $-\mu N_k'$. Since there is
no cross coupling between terms and the components of
$N_k'$ are mutually uncorrelated, the components of $V_k'$
will also be mutually uncorrelated, and the covariance
matrix of $V_k'$ will be diagonal. To find this matrix one first
multiplies both sides of (69) by their own transposes:

$$V_{k+1}'V_{k+1}'^T = (I - 2\mu\Lambda)V_k'V_k'^T(I - 2\mu\Lambda) + \mu^2 N_k'N_k'^T - \mu(I - 2\mu\Lambda)V_k'N_k'^T - \mu N_k'V_k'^T(I - 2\mu\Lambda). \qquad (70)$$

Taking expected values of both sides yields

$$\operatorname{cov}[V_k'] = (I - 2\mu\Lambda)\operatorname{cov}[V_k'](I - 2\mu\Lambda) + \mu^2\operatorname{cov}[N_k']. \qquad (71)$$

Combining terms further yields

$$\operatorname{cov}[V_k'] = \mu^2(4\mu\Lambda - 4\mu^2\Lambda^2)^{-1}\operatorname{cov}[N_k']. \qquad (72)$$
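
Because all matrices in the covariance recursion are diagonal, each mode can be examined as a scalar. The sketch below iterates the scalar variance recursion for one mode until it settles, then compares the result with the closed-form solution and with the small-step approximation obtained by dropping the squared term. The values of the adaptation constant, eigenvalue, and gradient noise variance are assumptions.

```python
def iterate_variance(mu, lam, noise_var, iterations=2000):
    """Iterate c <- (1 - 2 mu lam)^2 c + mu^2 noise_var from c = 0."""
    c = 0.0
    ratio = 1.0 - 2.0 * mu * lam
    for _ in range(iterations):
        c = ratio * ratio * c + mu * mu * noise_var
    return c


def exact_variance(mu, lam, noise_var):
    """Fixed point of the recursion: mu^2 noise_var / (4 mu lam - 4 mu^2 lam^2)."""
    return mu * mu * noise_var / (4.0 * mu * lam - 4.0 * mu * mu * lam * lam)


def approx_variance(mu, lam, noise_var):
    """Small-mu approximation: (mu / 4) noise_var / lam."""
    return mu * noise_var / (4.0 * lam)


MU = 0.01         # assumed adaptation constant
LAM = 1.0         # assumed eigenvalue for this mode
NOISE_VAR = 2.0   # assumed variance of the gradient noise component

iterated = iterate_variance(MU, LAM, NOISE_VAR)
exact = exact_variance(MU, LAM, NOISE_VAR)
approx = approx_variance(MU, LAM, NOISE_VAR)
```

For these assumed values the approximation differs from the exact steady-state variance by about one percent, which is the size of the neglected term relative to the retained one.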

In practical circumstances the method of steepest descent
is implemented with a small value of $\mu$, so that

$$\mu\Lambda \ll I. \qquad (73)$$

Neglecting the squared terms in (72) thus yields

$$\operatorname{cov}[V_k'] \approx \frac{\mu}{4}\Lambda^{-1}\operatorname{cov}[N_k']. \qquad (74)$$

Using (66) one may now write

$$\operatorname{cov}[V_k'] \approx \frac{\mu\xi_{\min}^2}{4N\delta^2}\Lambda^{-1}. \qquad (75)$$

The components of $V_k'$ are mutually uncorrelated but not
all of the same variance. The covariance of $V_k$ can be
obtained from (75) by using (18) and (20):

$$\operatorname{cov}[V_k] = E[QV_k'V_k'^TQ^{-1}] \approx \frac{\mu\xi_{\min}^2}{4N\delta^2}R^{-1}. \qquad (76)$$

4) Misadjustment: Without noise in the weight vector,
adaptation by the method of steepest descent would converge
to a steady-state solution at the minimum point of
the mean square error surface. The mean square error
would therefore be $\xi_{\min}$. Noise in the weight vector, however,
tends to cause the steady-state solution to vary randomly
about the minimum point, that is, to “climb the sides of
the bowl.” The result is an “excess” mean square error, a
mean square error that is greater than $\xi_{\min}$.
An expression for mean square error in terms of $V_k'$
is given by (22), where the excess mean square error is
$V_k'^T\Lambda V_k'$. The average excess mean square error is

$$E[V_k'^T\Lambda V_k'] = \sum_{p=1}^{n}\lambda_p E[(v_{pk}')^2]. \qquad (77)$$

From (75) one may write

$$E[(v_{pk}')^2] = \frac{\mu\xi_{\min}^2}{4N\delta^2}\cdot\frac{1}{\lambda_p}. \qquad (78)$$

Thus (77) can be rewritten

$$E[V_k'^T\Lambda V_k'] = \frac{n\mu\xi_{\min}^2}{4N\delta^2}. \qquad (79)$$

A useful parameter in the design of adaptive processes
is the misadjustment $M$, which is defined as the average
excess mean square error divided by the minimum mean
square error:

$$M \triangleq \frac{\text{average excess mean square error}}{\xi_{\min}}. \qquad (80)$$

The misadjustment is a dimensionless measure of the difference
between adaptive performance and optimal Wiener
performance as a result of gradient noise. In other words,
it is a measure of the cost of adaptability.

Using (79) one can express the misadjustment for the
DSD algorithm as follows:

$$M = \frac{n\mu\xi_{\min}}{4N\delta^2}. \qquad (81)$$

This formula is simple and clear but can be more usefully
expressed in terms of time constants of the learning process
and the perturbation of the gradient estimation process.

Each gradient component measurement uses 2N samples
of data. Each iteration involves n gradient component
measurements and therefore requires 2Nn data samples.
The time constant $\tau_{p\,mse}$ is given by (42) in number of
iterations, a basic “unit of time.” If one now defines a new
time constant $T_{p\,mse}$ whose basic unit is the data sample and
whose value is expressed in number of data samples, then
for the DSD algorithm

$$T_{p\,mse} \triangleq 2nN\tau_{p\,mse}. \qquad (82)$$

The new time constant is easily related to real time if the
sampling rate is known.

Using the perturbation formula (56) one can reexpress the
misadjustment for the DSD algorithm (81) as

$$M = \frac{n\mu\lambda_{av}}{4NP}. \qquad (83)$$

Using (42) the time constant defined by (82) can also be
reexpressed as

$$T_{p\,mse} = \frac{nN}{2\mu\lambda_p}, \qquad (84)$$

which is equivalent to

$$\mu\lambda_p = \frac{nN}{2}\cdot\frac{1}{T_{p\,mse}}. \qquad (85)$$

Averaging (85) over the n modes gives

$$\mu\lambda_{av} = \frac{nN}{2}\left(\frac{1}{T_{mse}}\right)_{av}. \qquad (86)$$

Combining (86) with (83) shows the misadjustment to be

$$M = \frac{n^2}{8P}\left(\frac{1}{T_{mse}}\right)_{av}. \qquad (87)$$

For the DSD algorithm, misadjustment is thus proportional
to the square of the number of weights and
inversely proportional to the perturbation. It is also
proportional to the speed of adaptation; that is,
fast adaptation results in a high misadjustment. More
specifically, the misadjustment is dependent on the average
reciprocal time constant of the learning curve whose time
base is calibrated in number of data samples. Note that
very fast modes may dominate this average and cause an
increase in misadjustment, while the rate of convergence
will remain limited by the slowest mode. In other words,
with disparate eigenvalues in the $R$-matrix, the adaptive
process may be afflicted with the misadjustment of its
fastest modes but may converge only at the rate of its
slowest modes. With equal or closely similar eigenvalues,
the process is more efficient, and the misadjustment is
given by

$$M = \frac{n^2}{8PT_{mse}}. \qquad (88)$$

In this case the learning curve has only one time constant,
$T_{mse}$.

Misadjustment as defined here is a normalized performance
penalty resulting from noise in the weight vector
and is a stochastic effect. In an actual adaptive system,
where the weight vector is deterministically perturbed to
measure the gradient, another penalty accrues, the perturbation,
also a ratio of excess mean square error to minimum
mean square error. The total excess mean square error can
be shown to be the sum of the “stochastic” and “deterministic”
components. The total misadjustment is thus

$$M_{tot} \triangleq M + P. \qquad (89)$$
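
The trade-off between the stochastic and deterministic penalties can be explored numerically. The sketch below assumes the stochastic misadjustment varies as n² times the average reciprocal time constant divided by 8P, adds the perturbation P, and locates the minimum by setting the derivative with respect to P to zero. The number of weights and the time constant are hypothetical design values.

```python
import math


def dsd_misadjustment(n, p, avg_inv_t):
    """Stochastic misadjustment M = n^2 (1/T_mse)_av / (8 P)."""
    return n * n * avg_inv_t / (8.0 * p)


def total_misadjustment(n, p, avg_inv_t):
    """Total misadjustment M_tot = M + P."""
    return dsd_misadjustment(n, p, avg_inv_t) + p


def optimal_perturbation(n, avg_inv_t):
    """Perturbation that zeroes d(M_tot)/dP, making the two terms equal."""
    return math.sqrt(n * n * avg_inv_t / 8.0)


N_WEIGHTS = 10           # assumed number of weights
AVG_INV_T = 1.0 / 5000   # assumed average of 1/T_mse, in 1/samples

p_opt = optimal_perturbation(N_WEIGHTS, AVG_INV_T)
m_tot_min = total_misadjustment(N_WEIGHTS, p_opt, AVG_INV_T)
```

For these assumed values the optimal perturbation is 0.05 and the minimum total misadjustment is 0.10, that is, twice the optimal perturbation.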


Adding these components yields

$$M_{tot} = \frac{n^2}{8P}\left(\frac{1}{T_{mse}}\right)_{av} + P. \qquad (90)$$

The perturbation is a design parameter. Its choice is
optimized by differentiating (90) with respect to $P$ and
setting the derivative to zero. The result is to make the two
right-hand terms of (90) equal. The optimal perturbation
is thus

$$P_{opt} = \left[\frac{n^2}{8}\left(\frac{1}{T_{mse}}\right)_{av}\right]^{1/2} \qquad (91)$$

and the minimum total misadjustment is

$$(M_{tot})_{\min} = 2P_{opt} = \left[\frac{n^2}{2}\left(\frac{1}{T_{mse}}\right)_{av}\right]^{1/2}. \qquad (92)$$

The use of the above misadjustment formulas in the
design of adaptive systems will be illustrated in Section V
below.

B. LMS Algorithm

The LMS algorithm is an implementation of the method
of steepest descent that employs a gradient estimation
technique more efficient than derivative measurement. This
algorithm, however, is not universally applicable, and its
use is restricted to the adaptive linear combiner of Fig. 1,
where inputs $X_j$ and $d_j$ are given.

1) Gradient estimation, convergence, time constants: The
error $\epsilon_j$ of the adaptive linear combiner of Fig. 1 is given
by (4). A gradient estimate may be obtained by squaring
the single value of $\epsilon_j$ and differentiating it as if it were the
mean square error:

$$\hat{\nabla}_j = \frac{\partial \epsilon_j^2}{\partial W} = 2\epsilon_j \frac{\partial \epsilon_j}{\partial W} = -2\epsilon_j X_j. \qquad (93)$$

Substituting (93) into (24) yields the LMS algorithm:

$$W_{j+1} = W_j + 2\mu\epsilon_j X_j. \qquad (94)$$

Since a new gradient estimate is obtained with each data
sample, an adaptive iteration is effected with the arrival of
each sample. The index k is thus replaced with the index j.
The gradient estimate of (93) may be implemented in a
practical system without further squaring, averaging, or
differentiation and is elegant in its simplicity and efficiency.
All components of the gradient vector are obtained from a
single data sample without perturbation of the weight vector.
Since the estimate is obtained without averaging, it contains
a large component of noise. The noise, however, is averaged
and attenuated by the adaptive process, which acts as a
low-pass filter in this respect. It is important to note also
that for a fixed value of $W$ the estimate is unbiased:

$$E[\hat{\nabla}_j] = -2E[\epsilon_j X_j] = -2E[d_j X_j - X_j X_j^T W]. \qquad (95)$$

From (10), the formula for the true gradient, this expression
can be rewritten as

$$E[\hat{\nabla}_j] = -2(P - RW) = \nabla. \qquad (96)$$

Proofs of convergence of the LMS algorithm have
appeared in the literature [4], [11], [17]-[20].⁵ These
proofs show that the algorithm is stable when

$$\frac{1}{\lambda_{\max}} > \mu > 0, \qquad (97)$$

which is the same as the condition for stability of the method
of steepest descent in general, given by (33). It is also shown
in [4] and [19] that the time constants of the LMS algorithm
are

$$\tau_p = \frac{1}{2\mu\lambda_p}, \qquad \tau_{p\,mse} = \frac{1}{4\mu\lambda_p}, \qquad (98)$$

which are similarly identical to the time constants for the
method of steepest descent, given by (42). Once again, $\tau_p$
is the time constant of the pth mode for transient phenomena
in the weights, while $\tau_{p\,mse}$ is the corresponding time constant
of the learning curve. Since only one data sample per iteration
is used, the time constant expressed in number of data
samples is

$$T_{p\,mse} = \tau_{p\,mse}. \qquad (99)$$

⁵ For input vectors $X_j$ mutually uncorrelated over time; proofs for
correlated input vectors have been developed in [21] and [22].
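
The LMS update is compact enough to sketch in full. The example below assumes a noise-free system identification setup, with independent unit-variance Gaussian inputs (so that R is the identity matrix) rather than a tapped delay line; the unknown weights, adaptation constant, and seed are hypothetical. Under these assumptions the minimum mean square error is zero and the weights converge to the unknown system.

```python
import random


def lms_identify(true_weights, mu, n_iterations, rng):
    """Run W_{j+1} = W_j + 2 mu eps_j X_j against a noise-free target."""
    n = len(true_weights)
    w = [0.0] * n
    errors = []
    for _ in range(n_iterations):
        # Assumed input model: independent, zero-mean, unit-variance samples.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        d = sum(t * xi for t, xi in zip(true_weights, x))   # desired response
        y = sum(wi * xi for wi, xi in zip(w, x))            # combiner output
        err = d - y                                          # error
        w = [wi + 2.0 * mu * err * xi for wi, xi in zip(w, x)]
        errors.append(err)
    return w, errors


TRUE_WEIGHTS = [0.8, -0.4, 0.2]   # hypothetical system to be identified
MU = 0.05                         # assumed adaptation constant

rng = random.Random(0)
weights, errors = lms_identify(TRUE_WEIGHTS, MU, 2000, rng)
```

With a nonzero minimum mean square error the weights would instead fluctuate about the Wiener solution, which is the weight noise analyzed in the subsections that follow.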

2) Gradient measurement noise: Let it be assumed that
the adaptive process, using a small value of the adaptive
constant $\mu$, has converged to a steady state near the
minimum point of the mean square error surface defined
by (9). The gradient estimation noise of the LMS algorithm
at the minimum point, where the true gradient is zero, is
the gradient estimate itself:

$$N_j = \hat{\nabla}_j = -2\epsilon_j X_j. \qquad (100)$$

The covariance of this noise is given by

$$\operatorname{cov}[N_j] = E[N_j N_j^T] = 4E[\epsilon_j^2 X_j X_j^T]. \qquad (101)$$

It is well known from Wiener filter theory that, when the
weight vector is optimized (that is, when $W_j = W^*$), the
error $\epsilon_j$ is uncorrelated with the input vector $X_j$. If one
assumes that $\epsilon_j$ and $X_j$ are Gaussian, not only are they
uncorrelated at the minimum point of the error surface
but also statistically independent. Under these conditions
(101) becomes

$$\operatorname{cov}[N_j] = 4E[\epsilon_j^2]E[X_j X_j^T] = 4\xi_{\min}R. \qquad (102)$$

In primed coordinates the covariance is

$$\operatorname{cov}[N_j'] = Q^{-1}\operatorname{cov}[N_j]Q = 4\xi_{\min}\Lambda. \qquad (103)$$

3) Noise in the weight vector: Equations (67)-(74) above
apply to the method of steepest descent with any means of
gradient estimation that results in a diagonal covariance
matrix for $N_j'$, that is, to both the DSD algorithm and the
LMS algorithm. For the LMS algorithm, using (74) and
(103), one may write

$$\operatorname{cov}[V_j'] = \frac{\mu}{4}\Lambda^{-1}(4\xi_{\min}\Lambda) = \mu\xi_{\min}I. \qquad (104)$$

The covariance of the steady state noise in the weight
vector (at or near the minimum point of the mean square
error surface) is

$$\operatorname{cov}[V_j] = \mu\xi_{\min}I. \qquad (105)$$

4)  Misadjustment:  Fu.’’  the  LMS  algorithm  the  misad- 
justment  M,  defined  by  (?0),  may  be  found  as  follows. 
The  average  excess  mean  square  error,  given  by  (77),  may 
be  written  as 


i  i  A, 

p“>  p-i 

(106) 


where,  according  to  (104),  £[(i>,y')^]  =  for  all  p. 
The  misadjustment  is  thus  given  by 


M 


p  tr  R. 


(107) 


•min 


This useful formula may be reexpressed in a manner that
allows one to perceive the relationship between misadjustment
and rate of adaptation. According to (98) one may
write


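$$\mu\lambda_p = \frac{1}{4\tau_{p\,\mathrm{mse}}} \tag{108}$$

and

$$\mu\operatorname{tr} R = \frac{1}{4}\sum_{p=1}^{n}\frac{1}{\tau_{p\,\mathrm{mse}}} = \frac{n}{4}\left(\frac{1}{\tau_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}. \tag{109}$$

The misadjustment may thus be written

$$M = \frac{n}{4}\left(\frac{1}{\tau_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}. \tag{110}$$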


It is interesting to compare (110) with (87), the misadjustment
formula for the DSD algorithm. Once again it is
apparent that misadjustment is reduced by slow adaptation,
by making the values of $\tau_{p\,\mathrm{mse}}$, where p = 1, ..., n, large.
With the LMS algorithm, however, for a given value of
misadjustment, the adaptive time constants increase linearly
with the number of weights rather than with the square of
the number of weights. Furthermore, there is no perturbation.
In typical circumstances much faster adaptation is
thus possible than with the DSD algorithm, as will be borne
out by the numerical examples presented in Section VI.

It may also be observed from (110) that the LMS
algorithm, since it is based on the method of steepest
descent, suffers like the DSD algorithm when there is a
great disparity in the eigenvalues of R. Under such conditions
misadjustment once again can be dominated by
the fastest modes (those with the smallest time constants),
while the rate of convergence can be limited by the
slowest modes.

When the eigenvalues are equal, a useful formula for the
misadjustment of the LMS algorithm is

$$M = \frac{n}{4\tau_{\mathrm{mse}}}. \tag{111}$$


Experience has shown this formula to be a good approximation
of the relationship between misadjustment, time
constant of the learning curve, and number of weights even
when the eigenvalues are not equal. Such a relationship is
needed in designing an adaptive system when the eigenvalues
are unknown.

Since trace R is the total power of the inputs to the weights,
which is generally known, one can use (107) in choosing
a value of $\mu$ that will produce a desired value of M. One
can accordingly combine (111) and (107) to obtain a general
formula for time constant of the learning curve with equal
eigenvalues:


$$\tau_{\mathrm{mse}} = \frac{n}{4\mu\operatorname{tr} R}. \tag{112}$$


This  formula  is  also  a  good  approximation  in  many  cases 
when  the  eigenvalues  of  R  are  unequal. 
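
As an illustration of how (107) and (112) might be used together in design, the following is a minimal sketch, not taken from the paper. The function name `lms_step_size` and its arguments are hypothetical; the input powers are assumed to be known, and the eigenvalues of R are assumed equal so that (112) applies.

```python
def lms_step_size(misadjustment, input_powers):
    """Pick mu from a desired misadjustment M using M = mu * tr R  (107),
    then report the learning-curve time constant from (112).

    input_powers: power of the input to each weight (hypothetical argument);
    their sum is tr R. Equal eigenvalues of R are assumed for (112).
    """
    trace_r = sum(input_powers)
    n = len(input_powers)
    mu = misadjustment / trace_r           # from (107)
    tau_mse = n / (4.0 * mu * trace_r)     # from (112), in iterations
    return mu, tau_mse


# Ten weights with unit input power and a 10 percent misadjustment target.
mu, tau_mse = lms_step_size(0.10, [1.0] * 10)
```

With one data sample per LMS iteration, the returned time constant is also the time constant in data samples.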


IV.  Random  Search 

The method of steepest descent is a systematic surface-searching
procedure. Although randomness enters in
practice through gradient estimation noise, adaptation by
this method is basically a deterministic process. Random
search, by contrast, seeks to improve performance by
making random changes in system parameters. A simple
algorithm based on this method, inspired by the Darwinian
concept of evolution, may be called random search by
“natural selection.” Though derived from a natural model
this algorithm appears to offer a practical approach to the
adaptive process that may have engineering merit [23].

In  random  search  by  natural  selection  a  random  change 
is  made  in  the  weight  vector  of  an  adaptive  processor,  such 
as the linear combiner of Fig. 1. The mean square error is
measured  before  and  after  the  change  and  the  measurements 
compared.  If  the  change  causes  the  error  to  be  lower,  it  is 
accepted.  If  it  does  not,  it  is  rejected,  and  a  new  random 
change  is  tried.  This  procedure  can  be  described  algebraically 
as  follows : 

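$$W_{k+1} = W_k + \tfrac{1}{2}\left[1 + \operatorname{sgn}\left(\hat{\xi}(W_k) - \hat{\xi}(W_k + U_k)\right)\right]U_k \tag{113}$$

where $U_k$ is a random vector; $\hat{\xi}(W_k)$ is an estimate of mean
square error based on N samples of $\epsilon_j$ with $W = W_k$;
$\hat{\xi}(W_k + U_k)$ is an estimate of mean square error based on
N samples of $\epsilon_j$ with $W = W_k + U_k$; and $\operatorname{sgn}(z)$ is +1
for $z \ge 0$ and -1 for $z < 0$.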
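
The accept-or-reject rule of (113) can be sketched as follows. This is an illustrative sketch only: the function name, the Gaussian distribution of the trial change, and the use of an exact quadratic surface in place of a noisy N-sample estimate are all assumptions made for the example.

```python
import random


def natural_selection_step(w, mse_estimate, sigma, rng):
    """One iteration of random search by natural selection, following (113).

    w            : current weight vector (list of floats)
    mse_estimate : callable returning an estimate of mean square error
    sigma        : standard deviation of each trial component (assumed Gaussian)
    rng          : random.Random instance
    """
    # Tentative random change U_k.
    u = [rng.gauss(0.0, sigma) for _ in w]
    trial = [wi + ui for wi, ui in zip(w, u)]
    # sgn(z) = +1 for z >= 0, so the change is kept when the error is not higher.
    if mse_estimate(w) - mse_estimate(trial) >= 0:
        return trial
    return list(w)


# Example on a noise-free quadratic surface with minimum 0.5 at the origin.
rng = random.Random(0)
surface = lambda w: 0.5 + sum(x * x for x in w)
w = [1.0, 1.0]
for _ in range(200):
    w = natural_selection_step(w, surface, 0.1, rng)
```

On a noise-free surface the error can never increase from one iteration to the next; with a noisy estimate, occasional wrong decisions are to be expected.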

This algorithm, though easy to implement, has the
drawback that nothing is learned when a trial change is
rejected and forgotten. For this reason a more efficient
“linear” random search algorithm, hereafter called the
“LRS algorithm,” has been devised. In this algorithm,
first described here, a small random change $U_k$ is tentatively
added to the weight vector at the beginning of each iteration.
The corresponding change in mean square error performance
is observed. A permanent weight vector change,
proportional to the product of the change in performance
and the initial tentative change, is then made. This procedure
can be expressed algebraically as follows:

$$W_{k+1} = W_k + \beta\left[\hat{\xi}(W_k) - \hat{\xi}(W_k + U_k)\right]U_k \tag{114}$$

where $U_k$ is a random vector from a random vector generator
designed to have a covariance of $\sigma^2 I$; $\hat{\xi}(W_k)$ and $\hat{\xi}(W_k + U_k)$
are defined as in (113); and the terms $\beta$ and $\sigma^2$ are design
constants affecting stability and rate of adaptation.

The LRS algorithm is “linear” because the weight change
is proportional to the change in mean square error, and in
this respect it differs from random search by natural
selection as described in (113). The latter algorithm is
simpler to implement but does not perform as well. It is
also difficult to treat mathematically, and a performance
analysis is not attempted in this paper.
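
For comparison, the LRS update of (114) might be sketched as below. The function names are hypothetical, the perturbation is assumed Gaussian with covariance $\sigma^2 I$, and the error surface in the example is an exact quadratic rather than an N-sample estimate.

```python
import random


def draw_perturbation(n, sigma, rng):
    """Draw a tentative change U_k with covariance sigma^2 I (Gaussian assumed)."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def lrs_update(w, u, mse_estimate, beta):
    """W_{k+1} = W_k + beta * [xi_hat(W_k) - xi_hat(W_k + U_k)] * U_k, as in (114)."""
    trial = [wi + ui for wi, ui in zip(w, u)]
    gain = beta * (mse_estimate(w) - mse_estimate(trial))
    # The permanent change is proportional to the observed change in error.
    return [wi + gain * ui for wi, ui in zip(w, u)]


# Example: a few iterations on a noise-free quadratic surface.
rng = random.Random(0)
surface = lambda w: 0.5 + sum(x * x for x in w)
w = [1.0, 0.0]
for _ in range(100):
    w = lrs_update(w, draw_perturbation(len(w), 0.1, rng), surface, beta=1.0)
```

Unlike (113), a trial that raises the error is not discarded: the weight vector moves in the opposite direction, so every trial contributes to the adaptation.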

For the purpose of analyzing the LRS algorithm, the
following definitions are useful. The true change in mean
square error resulting from the addition of $U_k$ to $W_k$ is
given by

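$$(\Delta\xi)_k \triangleq \xi(W_k + U_k) - \xi(W_k). \tag{115}$$

The corresponding estimated change in mean square error is

$$(\widehat{\Delta\xi})_k \triangleq \hat{\xi}(W_k + U_k) - \hat{\xi}(W_k). \tag{116}$$

The error in the estimated change is

$$\zeta_k \triangleq (\Delta\xi)_k - (\widehat{\Delta\xi})_k \tag{117}$$

whose variance, from (59), is given by

$$\operatorname{var}[\zeta_k] = \operatorname{var}[(\widehat{\Delta\xi})_k] = \operatorname{var}[\hat{\xi}(W_k + U_k)] + \operatorname{var}[\hat{\xi}(W_k)] = \frac{2}{N}\left[\xi^2(W_k + U_k) + \xi^2(W_k)\right]. \tag{118}$$

In steady state operation near the minimum point of the
mean square error surface, (118) can be expressed as

$$\operatorname{var}[\zeta_k] \cong \frac{4}{N}\xi_{\min}^2. \tag{119}$$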

A perturbation is caused by the tentative changes in the
weight vector that are a part of the LRS algorithm. At
each iteration, N samples of data are used to obtain $\hat{\xi}(W_k)$,
with the weight vector set at its nominal value, and N
samples to obtain $\hat{\xi}(W_k + U_k)$. The next nominal value is
chosen immediately after the two $\hat{\xi}$ measurements are made.
During a given cycle the average excess mean square error
is thus given by

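$$E\left[\tfrac{1}{2}\left(\xi(W_k) + \xi(W_k + U_k)\right) - \xi(W_k)\right] = \tfrac{1}{2}E\left[\xi(W_k + U_k) - \xi(W_k)\right]. \tag{120}$$

Since $U_k$ has zero mean and is uncorrelated with $W_k$, and
since $\operatorname{cov}[U_k] = \operatorname{cov}[U_k'] = \sigma^2 I$, the average excess
mean square error can also be expressed as

$$\tfrac{1}{2}E[U_k^T R U_k] = \tfrac{1}{2}E[U_k'^T \Lambda U_k'] = \tfrac{1}{2}\sigma^2\operatorname{tr} R. \tag{121}$$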

The  perturbation  P  is  defined  as  the  ratio  of  the  average 
excess  mean  square  error  (resulting  from  tentative  changes 
in  the  weight  vector)  to  the  minimum  mean  square  error. 
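It may thus be expressed as

$$P = \frac{\sigma^2\operatorname{tr} R}{2\xi_{\min}} = \frac{n\sigma^2\lambda_{\mathrm{av}}}{2\xi_{\min}}. \tag{122}$$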


1) Stability, time constants of LRS algorithm: Equation
(114) may be rewritten, using (115), (116), and (117), as
follows:

$$W_{k+1} = W_k + \beta\left[-(\Delta\xi)_k + \zeta_k\right]U_k \tag{123}$$

or

$$V_{k+1} = V_k + \beta\left[-(\Delta\xi)_k + \zeta_k\right]U_k. \tag{124}$$

If one lets $\sigma^2$ be small by design, so that $U_k$ is always small,
one can write

$$(\Delta\xi)_k = U_k^T\nabla_k = 2U_k^T R V_k. \tag{125}$$

Substituting (125) into (124) then yields

$$V_{k+1} = V_k + \beta U_k\left[-2U_k^T R V_k + \zeta_k\right] = (I - 2\beta U_k U_k^T R)V_k + \beta\zeta_k U_k. \tag{126}$$

Equations (114) and (126) are equivalent representations
of the LRS algorithm, the former more useful for implementation
and the latter for analysis. Equation (126)
shows that the weight vector is the solution of a first-order
linear vector difference equation having a randomly time-variable
coefficient $-2\beta U_k U_k^T R$ and a random driving
function $\beta\zeta_k U_k$.

Both sides of (126) may be premultiplied by $Q^{-1}$ to
obtain an equivalent expression in primed coordinates:

$$V_{k+1}' = (I - 2\beta U_k' U_k'^T \Lambda)V_k' + \beta\zeta_k U_k'. \tag{127}$$

Though this expression is simpler than (126), it remains
difficult to solve because of cross coupling and randomness
in the matrix coefficient. It is thus necessary to derive
stability conditions for the LRS algorithm without an
explicit solution to (127). One may begin by studying the
behavior of the mean of the weight vector.

By taking expected values of both sides of (127) and
observing that $U_k'$ is a random vector uncorrelated with
$\zeta_k$ and $V_k'$, one obtains

$$E[V_{k+1}'] = E[(I - 2\beta U_k' U_k'^T \Lambda)V_k'] + \beta E[\zeta_k U_k'] = (I - 2\beta E[U_k' U_k'^T]\Lambda)E[V_k'] + 0 = (I - 2\beta\sigma^2\Lambda)E[V_k']. \tag{128}$$

This equation is analogous to (34) for the method of steepest
descent. Its solution is

$$E[V_k'] = (I - 2\beta\sigma^2\Lambda)^k V_0'. \tag{129}$$

Equation (129) gives, for an initial condition of $V_k' = V_0'$,
the expected value of the weight vector's transient response.
Stability of (128) assures convergence of the mean of $V_k'$.
The stability condition is


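$$\frac{1}{\lambda_{\max}} > \beta\sigma^2 > 0. \tag{130}$$

When $\beta\sigma^2$ is so chosen, the following condition is fulfilled:

$$\lim_{k\to\infty} E[V_k'] = 0. \tag{131}$$

By analogy with the method of steepest descent, whose
transient behavior is characterized by (34) through (39), the
time constant of the pth mode of the expected value of the
weight vector is

$$\tau_p = \frac{1}{2\beta\sigma^2\lambda_p}. \tag{132}$$

The time constant of the pth mode of the mean square
error learning curve is half this value:

$$\tau_{p\,\mathrm{mse}} = \frac{1}{4\beta\sigma^2\lambda_p}. \tag{133}$$

2) Noise in the weight vector of the LRS algorithm: If
one lets $\beta\sigma^2$ be chosen so that (130) is satisfied, then the
mean of the weight vector will converge according to (131).
Convergence of the mean, however, does not necessarily
imply boundedness of the covariance of the weight vector.
For the purpose of obtaining an expression for the noise
in the weight vector, such boundedness is here assumed
without proof. It is also assumed that the weight vector
undergoes a stationary stochastic process after initial
adaptive transients have died out.

The assumed steady state covariance of the weight
vector may be calculated as follows. Multiplying both
sides of (127) by their own transposes yields

$$\begin{aligned}
V_{k+1}'V_{k+1}'^T ={}& (I - 2\beta U_k'U_k'^T\Lambda)V_k'V_k'^T(I - 2\beta\Lambda U_k'U_k'^T) + \beta^2\zeta_k^2 U_k'U_k'^T \\
&+ (I - 2\beta U_k'U_k'^T\Lambda)V_k'\beta\zeta_k U_k'^T + \beta\zeta_k U_k'V_k'^T(I - 2\beta\Lambda U_k'U_k'^T).
\end{aligned} \tag{134}$$

Noting that $\zeta_k$ and $U_k'$ are stationary processes of zero
mean uncorrelated with each other and taking expected
values of both sides of (134) yields

$$\begin{aligned}
E[V_{k+1}'V_{k+1}'^T] &= E[(I - 2\beta U_k'U_k'^T\Lambda)V_k'V_k'^T(I - 2\beta\Lambda U_k'U_k'^T)] + \beta^2 E[\zeta_k^2 U_k'U_k'^T] + 0 \\
&= E[(I - 2\beta U_k'U_k'^T\Lambda)V_k'V_k'^T(I - 2\beta\Lambda U_k'U_k'^T)] + \frac{4\beta^2\sigma^2}{N}\xi_{\min}^2 I.
\end{aligned} \tag{135}$$

Since in steady state $V_k'$ is also a stationary process of zero
mean uncorrelated with $U_k'$, one may write

$$E[V_{k+1}'V_{k+1}'^T] = E[(I - 2\beta U_k'U_k'^T\Lambda)E[V_k'V_k'^T](I - 2\beta\Lambda U_k'U_k'^T)] + \frac{4\beta^2\sigma^2}{N}\xi_{\min}^2 I \tag{136}$$

and

$$\begin{aligned}
\operatorname{cov}[V_k'] &= E[(I - 2\beta U_k'U_k'^T\Lambda)\operatorname{cov}[V_k'](I - 2\beta\Lambda U_k'U_k'^T)] + \frac{4\beta^2\sigma^2}{N}\xi_{\min}^2 I \\
&= \operatorname{cov}[V_k'] - 2\beta E[U_k'U_k'^T]\Lambda\operatorname{cov}[V_k'] - 2\beta\operatorname{cov}[V_k']\Lambda E[U_k'U_k'^T] \\
&\quad + 4\beta^2 E[U_k'U_k'^T\Lambda\operatorname{cov}[V_k']\Lambda U_k'U_k'^T] + \frac{4\beta^2\sigma^2}{N}\xi_{\min}^2 I \\
&= \operatorname{cov}[V_k'] - 2\beta\sigma^2\Lambda\operatorname{cov}[V_k'] - 2\beta\sigma^2\operatorname{cov}[V_k']\Lambda \\
&\quad + 4\beta^2 E[U_k'U_k'^T\Lambda\operatorname{cov}[V_k']\Lambda U_k'U_k'^T] + \frac{4\beta^2\sigma^2}{N}\xi_{\min}^2 I.
\end{aligned} \tag{137}$$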

Solving (137) to find the covariance of $V_k'$ is difficult
because the matrices cannot be factored. After reexamining
(130), however, one could argue heuristically that in steady
state the covariance matrix should be diagonal. All components
of the driving function of (127) are uncorrelated
with each other and uncorrelated over time. The random
coefficient $I - 2\beta U_k'U_k'^T\Lambda$ is furthermore diagonal on the
average, though generally not for each value of k, and
uncorrelated with $V_k'$ and with itself over time. Though
this argument does not constitute a proof that the covariance
of $V_k'$ is diagonal, it makes such an assumption plausible.

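If the covariance matrix of $V_k'$ is thus assumed to be
diagonal, then with some rearranging of terms (137)
becomes

$$4\beta\sigma^2\Lambda\operatorname{cov}[V_k'] - 4\beta^2 E[U_k'U_k'^T\Lambda\operatorname{cov}[V_k']\Lambda U_k'U_k'^T] = \frac{4\beta^2\sigma^2}{N}\xi_{\min}^2 I. \tag{138}$$

For slow adaptation, the case of greatest interest, it may be
noted that

$$\beta\sigma^2\Lambda \ll I \tag{139}$$

which is analogous to (75) for the method of steepest
descent.* One may note further that

$$\beta^2 E[U_k'U_k'^T\Lambda\operatorname{cov}[V_k']\Lambda U_k'U_k'^T] \cong (\beta\sigma^2\Lambda)^2\operatorname{cov}[V_k'] \tag{140}$$

and from (139) that

$$(\beta\sigma^2\Lambda)^2\operatorname{cov}[V_k'] \ll \beta\sigma^2\Lambda\operatorname{cov}[V_k']. \tag{141}$$

The term $-4\beta^2 E[\,\cdot\,]$ of (138) is thus small and can be
neglected. Equation (138) accordingly becomes

$$\operatorname{cov}[V_k'] = \frac{\beta}{N}\xi_{\min}^2\Lambda^{-1}. \tag{142}$$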


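Though this expression has not been rigorously derived,
experience has shown it to lead to misadjustment formulas
that are generally accurate.

* The role of $\mu$ in the method of steepest descent is the same as
that of $\beta\sigma^2$ in the LRS algorithm. Independent control over $\beta$ and $\sigma^2$
is necessary, however, because $\sigma^2$ controls in addition the perturbation.

3) Misadjustment of LRS algorithm: The average excess
mean square error due to noise in the weight vector is
given by (77). Using (142) one may write for the LRS
algorithm

$$(\text{average excess mean square error}) = \sum_{p=1}^{n}\lambda_p E[(v_{pk}')^2] = \frac{n\beta}{N}\xi_{\min}^2. \tag{143}$$

According to the definition of (80) the misadjustment of
the LRS algorithm is thus

$$M = \frac{n\beta}{N}\xi_{\min}. \tag{144}$$

This result can be usefully expressed, using (121), in
terms of the perturbation of the LRS process:

$$M = \frac{n\beta\sigma^2\operatorname{tr} R}{2NP} = \frac{n^2\beta\sigma^2\lambda_{\mathrm{av}}}{2NP}. \tag{145}$$

It can also be expressed in terms of time constants of the
adaptive process. The time constant of the pth mode of the
learning curve, expressed in number of iterations, is given
by (133). Since 2N samples of data are used per iteration,
this time constant expressed in number of data samples is

$$T_{p\,\mathrm{mse}} = 2N\tau_{p\,\mathrm{mse}} = \frac{N}{2\beta\sigma^2\lambda_p}. \tag{146}$$

Note the difference between (146) and the equivalent
expression (82) for the DSD algorithm, reflecting the
difference in utilization of data per adaptive cycle by the
two algorithms.

According to (146) one may write

$$\lambda_p = \frac{N}{2\beta\sigma^2}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right) \tag{147}$$

and

$$\lambda_{\mathrm{av}} = \frac{N}{2\beta\sigma^2}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}. \tag{148}$$

Inserting (148) into (145) yields

$$M = \frac{n^2}{4P}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}. \tag{149}$$

This formula closely resembles its counterpart (87) for the
DSD algorithm.

According to (89) the total misadjustment must include
the effects of perturbation. One may thus write

$$M_{\mathrm{tot}} = M + P = \frac{n^2}{4P}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}} + P. \tag{150}$$

Optimal choice of P requires that both right-hand terms of
(150) be equal and that P, therefore, be one-half the total
misadjustment (91). One may thus further write

$$(M_{\mathrm{tot}})_{\min} = 2P_{\mathrm{opt}} = n\left[\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}\right]^{1/2}. \tag{151}$$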

This  formula  once  again  closely  resembles  its  counterpart 
(92)  for  the  DSD  algorithm  and  is  further  indicative  of  the 
fact  that  many  behavioral  properties  of  the  LRS  algorithm 
resemble  those  of  steepest  descent  algorithms  despite  the 
difference  in  search  procedure. 

Other random search algorithms applicable to adaptive
control and pattern recognition systems have been described
in the literature [24]-[31]. These algorithms are capable
of taking advantage of performance measurements from
previous iterations in determining current parameter changes
and are useful in searching multimodal performance
surfaces. They tend to be complicated in implementation
and mathematical description, however, and have not been
analyzed to determine their misadjustment as a function of
rate of adaptation. It is conjectured in this regard that their
behavior may be somewhat similar to that of the LRS
algorithm and that their convergence close to optimal
points is relatively slow in high-dimensional spaces.


V. Summary of Analytical Results

In the foregoing sections analytical expressions have been
derived that characterize the performance of the DSD
and LMS algorithms, based on the method of steepest
descent, and the LRS algorithm, based on a random search




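TABLE I

Performance Characteristics of Adaptive Algorithms

Quantity | DSD algorithm | LMS algorithm | LRS algorithm
Misadjustment, $M$ | $\frac{n\mu\xi_{\min}}{4N\delta^2} = \frac{n^2}{8P}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}$ | $\mu\operatorname{tr} R = \frac{n}{4}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}$ | $\frac{n\beta\xi_{\min}}{N} = \frac{n^2}{4P}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}}$
Perturbation, $P$ | $\frac{\delta^2\lambda_{\mathrm{av}}}{\xi_{\min}}$ | - | $\frac{\sigma^2 n\lambda_{\mathrm{av}}}{2\xi_{\min}}$
Total misadjustment, $M_{\mathrm{tot}}$ | $M + P$ | $M$ | $M + P$
Time constant of pth mode in number of adaptive iterations, $\tau_{p\,\mathrm{mse}}$ | $\frac{1}{4\mu\lambda_p}$ | $\frac{1}{4\mu\lambda_p}$ | $\frac{1}{4\sigma^2\beta\lambda_p}$
Time constant of pth mode in number of data samples, $T_{p\,\mathrm{mse}}$ | $\frac{Nn}{2\mu\lambda_p}$ | $\frac{1}{4\mu\lambda_p}$ | $\frac{N}{2\sigma^2\beta\lambda_p}$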

Fig. 3. Time constant of adaptive process as function of number of
weights with total misadjustment $M_{\mathrm{tot}}$ fixed at 10 percent (perturbation
P optimized for DSD and LRS algorithms).


procedure.  The  most  important  of  these  expressions  are 
presented  in  Table  I  in  a  manner  that  allows  the  three 
algorithms  to  be  readily  compared. 

The principal measure of performance is the misadjustment
M, which is the penalty arising from the imperfect
statistical estimation process. The formulas presented show
that misadjustment increases with speed of adaptation, and
this result can be taken as a general rule of adaptive processing.
For a given real-time speed of adaptation* and given
number of adaptive parameters, however, misadjustment
varies considerably among the three algorithms. The most
efficient in this respect is the LMS algorithm. The DSD
and LRS algorithms, whose misadjustment expressions are
nearly equivalent, are considerably less efficient.

Fig. 3 shows the relative efficiency of the three algorithms
by plotting the required adaptive time constant as a function
of number of adaptive weights with total misadjustment
$M_{\mathrm{tot}}$ fixed at 10 percent. The eigenvalues of the R-matrix

* The basic unit of time in digital systems is the sampling period;
in analog systems it is the equivalent Nyquist sampling period
corresponding to the bandwidth of the error signal.


are assumed to be equal, and the value of the total misadjustment
for the DSD and LRS algorithms is minimized
according to (92) and (151). It is readily seen that for a large
number of weights the DSD and LRS algorithms have
similar time constants. The LMS algorithm, on the other
hand, has a much smaller time constant.

The formulas presented in Table I and the curves of Fig. 3
provide a practical tool for use in the design of adaptive
filters. For the purposes of illustration let us assume that
an adaptive digital filter with 10 weights is needed for a
particular application. Let us further assume that a total
misadjustment of 10 percent would be acceptable and that
the eigenvalues of the R-matrix are essentially equal. For
the DSD algorithm, a total misadjustment of 10 percent,
according to (91), yields an optimal perturbation of 5
percent. Thus the misadjustment M is

$$M = \frac{n^2}{8P}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}} = 5\ \text{percent}. \tag{152}$$

This equation can be solved by substituting the appropriate
values of n and P to obtain the average reciprocal time
constant in number of data samples:

$$\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}} = 2(10)^{-4}. \tag{153}$$

Since all eigenvalues are assumed to be equal, there is only
one time constant associated with the mean square error
curve, and (153) can be rewritten as

$$T_{\mathrm{mse}} = \frac{10^4}{2} = 5000\ \text{data samples}. \tag{154}$$

This  is  a  large  adaptive  time  constant  for  a  10-weight  filter. 

If the LMS algorithm is used instead of the DSD
algorithm, then there is no perturbation and the misadjustment
is

$$M = \frac{n}{4}\left(\frac{1}{T_{\mathrm{mse}}}\right) = 10\ \text{percent} \tag{155}$$

which yields a time constant of

$$T_{\mathrm{mse}} = 25\ \text{data samples}. \tag{156}$$

This is a much more favorable value. Within about four
time constants adaptive transients would essentially die
out. Settling time would be about 100 sampling periods or
iterations.

For the LRS algorithm one must once again allocate one-half
the total misadjustment to the perturbation P. The
misadjustment M is thus

$$M = \frac{n^2}{4P}\left(\frac{1}{T_{p\,\mathrm{mse}}}\right)_{\mathrm{av}} = 5\ \text{percent} \tag{157}$$

which yields a value of the time constant of

$$T_{\mathrm{mse}} = 10\,000\ \text{data samples}. \tag{158}$$

The LRS algorithm thus would require twice the settling
time required for the DSD algorithm. Note that the perturbation
is set as follows:

$$P = 0.05 = \frac{\sigma^2\operatorname{tr} R}{2\xi_{\min}} \tag{159}$$

which is equivalent to

$$\sigma^2 = \frac{0.1\,\xi_{\min}}{\operatorname{tr} R}. \tag{160}$$

To set $\sigma^2$ for the random vector generator one would need
to know the values of $\xi_{\min}$ and trace R. Approximate values
would be adequate in most practical circumstances.
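
The three design calculations above can be collected in a short sketch. The function name is hypothetical; the formulas are the equal-eigenvalue forms used in (152), (155), and (157), with the perturbation for the DSD and LRS algorithms set to one-half the total misadjustment.

```python
def time_constants_for_total_misadjustment(n, m_tot):
    """Learning-curve time constants, in data samples, for a given number of
    weights n and total misadjustment m_tot (a fraction, e.g. 0.10).

    Equal eigenvalues of R are assumed throughout.
    """
    p = m_tot / 2.0  # optimal perturbation for DSD and LRS, so M = P
    return {
        "DSD": n ** 2 / (8.0 * p * p),   # M = n^2 / (8 P T) with M = P
        "LMS": n / (4.0 * m_tot),        # M = n / (4 T), no perturbation
        "LRS": n ** 2 / (4.0 * p * p),   # M = n^2 / (4 P T) with M = P
    }


# Ten weights and 10 percent total misadjustment, as in the example above.
design = time_constants_for_total_misadjustment(10, 0.10)
```

For these values the sketch reproduces the time constants of about 5000, 25, and 10 000 data samples found in (154), (156), and (158).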

These results illustrate the efficiency of the LMS algorithm,
which has been shown to approach a theoretical limit for
adaptive algorithms when the eigenvalues of the R-matrix
are equal or close to equal in value [32].* There are
circumstances, however, where the LMS algorithm cannot
be used and where the DSD and LRS algorithms provide
a valuable option. An example is included in the applications
described in the next section.

VI. Experimental Results

In  this  section  the  results  of  experiments  performed  by 
computer  simulation  are  presented.  These  results  show  the 
relative  performance  of  the  DSD,  LMS,  and  LRS  algorithms 
in  practical  circumstances  of  varying  complexity.  They  also 
provide a means of verifying the expressions for misadjustment
and adaptive time constant derived in the preceding
sections. 

A.  Modeling  Experiments 

Two modeling or system identification problems were
simulated by computer to demonstrate the convergence of
the three algorithms and the degree of correspondence
between actual and theoretical performance. In these
simulations an adaptive transversal filter with four weights
was used. In the first the algorithms were required to converge
to a weight vector solution that modeled the impulse
response of a “digital” filter with a single fixed delay $\Delta$
of $z^{-2}$, where $z^{-1}$ is the transfer function of the unit delay.
In the second they were required to converge to a solution
that best approximated the infinite impulse response of a
one-pole recursive digital filter.

Fig. 4. Modeling a fixed delay with an adaptive filter.

* The gradient and performance estimation methods used in the
DSD and LRS algorithms involve taking the difference between two
large, noisy $\hat{\xi}$-quantities. Some of this difference is due to statistical
fluctuation (that is, to a change in data statistics from one sample to
the next), an undesirable effect, and some to the actual weight change,
a desirable effect. If the data could be repeated and the difference
confined to the latter effect, the result would be a reduction in the
amount of data required and a much better estimate. The gradient
estimation technique of the LMS algorithm is equivalent to such
“data repeating,” which accounts for its inherent efficiency.

1) Modeling a fixed delay: Fig. 4 shows the experimental
configuration used to test convergence of the algorithms
to model the fixed delay. An input signal $n_j$, composed of
independent samples of white noise of unit power, was
routed in parallel to the delay filter and the adaptive filter.
The output of the delay filter was corrupted by a second
input $n_j'$, composed of independent additive white noise
with a power of 0.5, to form the output of the system to be
modeled. This output, the desired response $d_j$ of the adaptive
process, was compared with the adaptive filter output $y_j$
in the normal way to form the error signal $\epsilon_j$.

The optimal weight vector solution $W^*$ for this experiment
is zero for all weights except that whose tap delay
corresponds to the delay $\Delta$. The value of this weight is
one. Thus, when the adaptive process has converged, the
error $\epsilon_j$ is the noise $n_j'$, which is uncorrelated over time.
The minimum mean square error is not zero but has a
value equal to the power of the noise $n_j'$. In addition, because
the input $n_j$ is white and of unit power, all inputs to the
weights are mutually uncorrelated and of unit power. The
input correlation matrix R is thus equal to the unit matrix $I$,
and all eigenvalues of R are equal to one. These circumstances
are the simplest that could be devised to test the
three adaptive algorithms.
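
A rough simulation of this configuration with the LMS algorithm might look like the sketch below. It is not the authors' program: the update is written in the form $W_{j+1} = W_j + 2\mu\epsilon_j X_j$ implied by the gradient estimate of (100), and the delay of two samples, the value of `mu`, the run length, and the seed are illustrative assumptions.

```python
import random


def simulate_fixed_delay_lms(mu=0.005, n_weights=4, delay=2,
                             n_samples=20000, seed=1):
    """Adapt a transversal filter to model a pure delay with the LMS algorithm."""
    rng = random.Random(seed)
    w = [0.0] * n_weights       # all weights start at zero
    taps = [0.0] * n_weights    # taps[i] holds the input delayed by i samples
    for _ in range(n_samples):
        # Shift in a new sample of unit-power white noise.
        taps = [rng.gauss(0.0, 1.0)] + taps[:-1]
        # Desired response: delayed input plus white noise of power 0.5.
        d = taps[delay] + rng.gauss(0.0, 0.5 ** 0.5)
        y = sum(wi * xi for wi, xi in zip(w, taps))
        err = d - y
        # LMS update: W <- W + 2 * mu * error * X.
        w = [wi + 2.0 * mu * err * xi for wi, xi in zip(w, taps)]
    return w


weights = simulate_fixed_delay_lms()
```

After convergence the weight at the tap matching the delay should fluctuate about one and the others about zero, with a spread governed by (105).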

Fig. 5 shows learning curves of the adaptive process
when the three algorithms were implemented with a fixed
theoretical time constant $T_{\mathrm{mse}}$ of 2048 data samples. An
individual learning curve and an ensemble average of 32
independent learning curves are presented for each
algorithm. The averaged curves allow the misadjustment of
the adaptive process to be experimentally measured.* The

* The measurement is made by dividing by $\xi_{\min}$ the difference
between the average value of asymptotic mean square error and $\xi_{\min}$.
Fig. 5. Results of fixed delay modeling experiment with theoretical time constant fixed at 2048 data samples.
(a) Individual learning curves. (b) Ensemble averages of 32 learning curves.


TABLE II

Results of Fixed Delay Modeling Experiment with Theoretical Time Constant
Fixed at 2048 Data Samples

Algorithm | $\mu \times 10^{-3}$ | $\beta$ | $\sigma^2 \times 10^{-3}$ | P theor., percent | P meas., percent | M theor., percent | M meas., percent | $M_{\mathrm{tot}}$ theor., percent | $M_{\mathrm{tot}}$ meas., percent | Theoretical time constant $T_{\mathrm{mse}}$, no. of data samples
DSD | 15.625 | - | - | 2.21 | 2.19 | 4.42 | 5.70 | 6.63 | 7.89 | 2048
LMS | 0.12207 | - | - | - | - | 0.0488 | 0.05 | 0.0488 | 0.05 | 2048
LRS | - | 0.5 | 7.8125 | 3.125 | 3.12 | 6.25 | 8.08 | 9.375 | 11.20 | 2048

“high-frequency” variations of the curves representing the
DSD and LRS algorithms are due to the required perturbation
of the weight vector at each iteration. At the beginning
of each experiment all adaptive weights were set to zero.

Table II presents the theoretical and measured values of
perturbation and misadjustment for the learning curves of
Fig. 5. Also shown are the values of the parameters $\mu$, $\beta$,
and $\sigma^2$. It is readily seen that the theoretical and measured
values are in close agreement for all three algorithms.

Fig. 6 presents individual learning curves and ensemble
averages of 32 learning curves showing convergence of the
three algorithms with a fixed theoretical total misadjustment
of 9.375 percent. Table III shows the values of perturbation,
misadjustment, and time constant together with the
values of the parameters $\mu$, $\beta$, and $\sigma^2$. Once again close
agreement between the theoretical and experimental results
is observed.

2) Modeling a one-pole recursive filter: Fig. 7 shows the
experimental configuration for the second modeling
experiment. An input $n_j$, composed once again of independent
samples of white noise of unit power, is routed in parallel
to an adaptive transversal filter and a one-pole recursive
Fig. 6. Results of fixed delay modeling experiment with theoretical total misadjustment fixed at 9.375 percent.
(a) Individual learning curves. (b) Ensemble averages of 32 learning curves.


TABLE III

Results of Fixed Delay Modeling Experiment with Theoretical Total Misadjustment $M_{\mathrm{tot}}$
Fixed at 9.375 Percent

Algorithm | $\mu \times 10^{-2}$ | $\beta$ | $\sigma^2 \times 10^{-3}$ | P theor., percent | P meas., percent | M theor., percent | M meas., percent | $M_{\mathrm{tot}}$ theor., percent | $M_{\mathrm{tot}}$ meas., percent | Theoretical time constant, no. of data samples
DSD | 3.125 | - | - | 3.125 | 3.11 | 6.25 | 8.26 | 9.375 | 11.37 | 1024
LMS | 2.34 | - | - | - | - | 9.375 | 10.35 | 9.375 | 10.35 | [illegible]
LRS | - | [illegible] | 7.8125 | 3.125 | [illegible] | 6.25 | 8.08 | 9.375 | 11.22 | 2048

digital filter whose transfer function is $1/(1 - az^{-1})$. The
output of the one-pole filter is the desired response $d_j$,
which is compared with the adaptive filter output to
produce the error $\epsilon_j$.

In this experiment the four-weight adaptive filter is
attempting to model a one-pole filter with an infinite
impulse response. Since the input $n_j$ is white noise, the
optimal solution is to cause the adaptive filter's impulse
response to match the one-pole filter's geometrical impulse
response to the extent allowed by the length of the adaptive
tapped delay line. A residual mean square error will be
present because the best match attainable is imperfect.
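
The size of this residual error can be illustrated with a short calculation. The sketch below assumes a unit-power white input, so that the best n-weight match is the first n samples of the geometric impulse response $a^k$ and the residual is the power in the truncated tail, $\sum_{k \ge n} a^{2k} = a^{2n}/(1 - a^2)$. The function name and the value of `a` in the example are hypothetical; the pole position used in the experiment is not given here.

```python
def truncated_one_pole_model(a, n_weights=4):
    """Best transversal-filter match to 1 / (1 - a z^-1) for a unit-power
    white input, together with the residual mean square error."""
    if not -1.0 < a < 1.0:
        raise ValueError("the one-pole filter is stable only for |a| < 1")
    # Optimal weights are the leading samples of the impulse response a**k.
    weights = [a ** k for k in range(n_weights)]
    # Residual error is the power of the impulse-response tail left unmodeled.
    residual_mse = a ** (2 * n_weights) / (1.0 - a * a)
    return weights, residual_mse


# Illustrative pole at a = 0.5 (an assumed value).
weights, residual = truncated_one_pole_model(0.5)
```

The residual grows rapidly as the pole approaches the unit circle, since more of the impulse response then lies beyond the end of the tapped delay line.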
In this case, when the adaptive filter has converged to the
optimal solution, the error $\epsilon_j$ will be correlated over time.
This latter condition violates one of the assumptions on
which the previous derivations of misadjustment and time
constant were based and can be expected to affect the
agreement between theoretical and measured misadjustment
and time constant.

Fig. 8 shows individual and averaged learning curves of
the adaptive process with a fixed theoretical total misadjustment
of 7.5 percent for the DSD and LRS algorithms
and of 0.75 percent for the LMS algorithm. Note the
difference in time scales and the rapid convergence of the
LMS algorithm. Table IV presents the values of perturbation,
misadjustment, and time constant and of the convergence
parameters. It may be seen that the measured
misadjustment is approximately twice the theoretical
misadjustment for the DSD and LRS algorithms. For the
LMS algorithm, however, measured and theoretical misadjustment
are in close agreement. The results for the
DSD and LRS algorithms are expected and can be attributed
to the fact that the correlation in the error over time
makes the effective statistical sample size less than the actual
number of error samples. The reason that the LMS algorithm
is not sensitive in this respect and does not experience a
loss in performance is not understood at the present time
and is a subject under investigation.

TABLE IV

Results of One-Pole Filter Modeling Experiment with Theoretical Total Misadjustment $M_{\mathrm{tot}}$
Fixed at 7.5 Percent for DSD and LRS Algorithms and 0.75 Percent for LMS Algorithm

Algorithm | P theor., percent | P meas., percent | M theor., percent | M meas., percent | $M_{\mathrm{tot}}$ theor., percent | $M_{\mathrm{tot}}$ meas., percent
DSD | [illegible] | [illegible] | [illegible] | [illegible] | 7.5 | [illegible]
LMS | - | - | 0.75 | [illegible] | 0.75 | [illegible]
LRS | [illegible] | 2.67 | 5 | 9.23 | 7.5 | 11.90

This  experiment  and  the  foregoing  fixed  delay  experiment 
demonstrate  that,  in  accordance  with  the  theoretical 
expectation,  the  performance  of  the  LMS  algorithm  is 
superior  to  that  of  the  DSD  and  LRS  algorithms,  whose 
performance  is  approximately  equivalent.  The  LMS 
algorithm  converges  more  rapidly  for  a  given  level  of 
misadjustment  or  is  less  noisy  (produces  less  misadjustment) 
for  a  given  rate  of  adaptation.  For  the  DSD  and  LRS 
algorithms  the  relationship  between  rate  of  adaptation  and 
misadjustment  is  known  approximately  for  a  wide  variety 
of  input  statistical  conditions.  For  the  LMS  algorithm  the 
relationship  under  the  same  variety  of  input  conditions  is 
known  to  a  closer  approximation. 
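
The behavior of the LMS algorithm on the one-pole modeling problem can be reproduced in a few lines. The following is a minimal sketch, not the simulation used for the reported results: the pole (0.5), the number of taps (8), the convergence parameter (mu = 0.005), the run length, and the random seed are all assumed values chosen for illustration.

```python
import random


def lms_model_one_pole(a=0.5, n_taps=8, mu=0.005, n_samples=5000, seed=0):
    """Adapt an FIR filter by LMS to model y_j = a*y_{j-1} + x_j with white input."""
    rng = random.Random(seed)
    weights = [0.0] * n_taps
    delay_line = [0.0] * n_taps          # most recent input sample first
    plant_output = 0.0
    squared_errors = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)          # unit-power white input
        delay_line = [x] + delay_line[:-1]
        plant_output = a * plant_output + x      # desired response d_j
        y = sum(w * v for w, v in zip(weights, delay_line))
        error = plant_output - y
        # LMS update: W_{j+1} = W_j + 2*mu*error*X_j
        weights = [w + 2.0 * mu * error * v
                   for w, v in zip(weights, delay_line)]
        squared_errors.append(error * error)
    return weights, squared_errors


weights, squared_errors = lms_model_one_pole()
print([round(w, 3) for w in weights])
```

Averaging `squared_errors` over many independent runs (different seeds) would give an averaged learning curve; a single run gives an individual one.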

B.  Adaptive  Cancelling  of  Sidelobe  Interference  in  a 
Receiving  Antenna  Array 

The  objective  of  this  experiment  is  to  demonstrate  one  of 
the  ways  in  which  adaptive  filtering  can  be  applied  to 
reduce  interference  received  by  the  sidelobes  of  an  antenna 
array.  Results  are  presented  only  for  the  LMS  algorithm. 
The  DSD  and  LRS  algorithms  could  also  be  used  with  this 
problem,  but  their  performance  would  not  equal  that  of  the 
LMS algorithm, as indicated by the formulas and experimental
results already presented. An experiment where
the  DSD  and  LRS  algorithms  are  applied  to  a  problem  that 
cannot  be  solved  by  the  LMS  algorithm  is  presented  in  the 
next  section. 

A number of adaptive beamforming methods capable
of reducing interference in the sidelobes of an antenna
array have been described in the literature [1]-[10]. These
methods have the disadvantage that, unless the adaptive
process is constrained, strong signal components in the
main beam are rejected. When the adaptive process is
constrained the signal is preserved, but there may be a loss
in array performance caused by gain or phase errors due
to nonuniformity in element placement, transfer function,
or near-field effects.

Fig. 9. Block diagram of null-constrained adaptive beamformer
tolerant of array element gain and phase errors.


By the use of adaptive noise cancelling techniques* it
is possible to realize a constrained adaptive beamformer
that does not suffer a significant loss in performance when
array element properties are not uniform. This beamformer,
described here for the first time, is capable of reducing
broadband and narrowband interference in the sidelobes of
an antenna array without rejecting broadband signal
components in the main beam, regardless of their strength.
It is also simple and easy to implement.

Fig.  9  is  a  block  diagram  of  the  constrained  adaptive 
beamformer.  An  array  of  receiving  elements  is  connected 
to  a  conventional  time  delay  and  sum  beamformer,  which  is 
steered  in  the  direction  of  the  signal.  The  conventional 
beamformer’s  output,  containing  signal  and  interference, 
forms  the  primary  input  to  an  adaptive  noise  canceller.  This 
input is delayed by an amount Δ/2, where Δ is defined
below, to form the desired response d_j of the adaptive
process. Multiple reference inputs to the noise canceller are
derived by taking the delayed element outputs from the
conventional beamformer before summation. These inputs
are routed to a bank of adaptive transversal filters, each
comprising a tapped delay line with a total delay of Δ.
The filter outputs are summed to form a single output y_j,
which is subtracted from d_j to obtain the canceller output z_j.

* Adaptive noise cancelling [33] is a form of optimal filtering that
makes use of two inputs, a "primary" input consisting of signal and
noise and a "reference" input consisting of noise correlated in some
unknown way with that in the primary input. The reference input is
adaptively filtered and subtracted from the primary input to obtain
a signal estimate in many cases superior to that obtainable by other
forms of adaptive or conventional filtering.

Fig. 10. Weighting coefficient matrices for null-constrained adaptive
beamformer. (a) Single column-of-zeros constraint. (b) Triple
column-of-zeros constraint. (c) "Hourglass" constraint.


This output also provides the "error" signal ε_j for the
adaptive process.

The  operation  of  the  adaptive  beamformer  of  Fig.  9  is 
constrained  by  constraining  the  weighting  coefficients 
(gains)  of  the  adaptive  filter  taps.  Fig.  10  shows  three  forms 
of  constraint,  each  suitable  for  a  different  purpose.  Fig. 
10(a)  represents  the  matrix  of  coefficients  appropriate  for 
an  ideal  line  array  with  a  plane-wave  signal  incident  in  the 
"look”  direction  of  the  conventional  beamformer.  The  gain 
of  the  central  taps  is  constrained  to  be  zero.  The  gains  w 
of  each  of  the  other  taps  are  independently  controlled  by 
the  adaptive  process.  Note  that  the  matrix  has  as  many 
rows  as  there  are  reference  inputs. 
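
One way to picture the column-of-zeros constraint is as a 1/0 mask applied to the matrix of tap gains. The following is a minimal sketch under that assumption; the function names are hypothetical, and enforcing the constraint by masking an LMS update is an illustrative choice, not a description of how the reported simulation was implemented.

```python
def column_of_zeros_mask(n_refs, n_taps, n_zero_columns=1):
    """1/0 mask whose n_zero_columns central columns are forced to zero.

    Rows correspond to reference inputs, columns to delay-line taps.
    n_taps and n_zero_columns are assumed odd so that a central tap exists.
    """
    center = n_taps // 2
    half = n_zero_columns // 2
    return [[0 if abs(k - center) <= half else 1 for k in range(n_taps)]
            for _ in range(n_refs)]


def constrained_lms_step(weights, taps, desired, mu, mask):
    """One LMS step in which the masked (zero) gains are never adapted."""
    y = sum(w * x for w_row, x_row in zip(weights, taps)
            for w, x in zip(w_row, x_row))
    error = desired - y
    new_weights = [[(w + 2.0 * mu * error * x) * m
                    for w, x, m in zip(w_row, x_row, m_row)]
                   for w_row, x_row, m_row in zip(weights, taps, mask)]
    return new_weights, error


single = column_of_zeros_mask(n_refs=3, n_taps=5, n_zero_columns=1)
triple = column_of_zeros_mask(n_refs=3, n_taps=7, n_zero_columns=3)
print(single[0], triple[0])
```

Because the mask multiplies every updated gain, a gain that starts at zero in a constrained column stays at zero for the whole run.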

In this problem the signal appearing at the central tap of
each adaptive filter is identical except in scale to d_j. If one
assumes that the received signal is "white" and has an
impulsive autocorrelation function, the signals appearing
at the other taps will be uncorrelated with d_j. It is thus
apparent that the signal components in y_j will be uncorrelated
with those in d_j and that the adaptive process will
have no tendency to cancel the received broadband signal.
Interference components arriving from other than the
look direction, on the other hand, will be correlated with
the interference components in d_j at one or more of the
unconstrained taps. These components will thus be cancelled
by the adaptive process, which adjusts the gain of the
unconstrained taps to minimize the mean square of the error
ε_j (in this case, output power).
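
The claim that a white signal at the central tap is uncorrelated with the same signal at every other tap can be checked numerically. The following is a minimal sketch, assuming unit-power Gaussian white noise and a 5-tap delay line; the tap count, sample count, and seed are illustrative assumptions.

```python
import random


def tap_correlations(n_taps=5, n_samples=20000, seed=1):
    """Sample correlation between the central tap and every tap of a delay
    line fed with unit-power white noise."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples + n_taps)]
    center = n_taps // 2
    correlations = []
    for k in range(n_taps):
        # Tap k and the central tap differ by a lag of (k - center) samples.
        products = [signal[j + center] * signal[j + k]
                    for j in range(n_samples)]
        correlations.append(sum(products) / n_samples)
    return correlations


print([round(c, 3) for c in tap_correlations()])
```

Only the central entry should be close to one; the others should be close to zero, which is why zeroing the central column removes the only tap through which a look-direction white signal could be cancelled.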

In practical applications arrays with ideal properties
cannot be realized because perfect receiving elements,
perfect element placement, and freedom from near-field
irregularities cannot be achieved. Fig. 10(b) shows a form
of constraint proposed to desensitize the behavior of the
adaptive sidelobe canceller to imperfections in the properties
of the receiving elements. This constraint consists of
inserting an additional column of zeros on either side of the
central column. Fig. 10(c) shows a configuration of the
weighting coefficients that would allow the reception of
strong broadband signals over a finite and controllable
angular sector; in this configuration the zeros are arranged
in the form of an "hourglass."

Fig. 11 shows directional response patterns obtained by
computer simulation that indicate the performance of the
adaptive beamformer of Fig. 9 with an ideal and a nonideal
array using the single and triple "column-of-zeros" constraints.
The ideal array consists of ten elements in a linear
configuration and with half-wavelength spacing at the
sampling frequency; for the nonideal array the single
elements at each end of the array are moved forward
one-quarter of a wavelength. The simulated received signal has a
power of one, a white spectrum, and originates from a point
source. The simulated interference is isotropic, with a
power of 0.01 and a white spectrum. The directional
response of the conventional time delay and sum beamformer
is shown as a dotted line for purposes of comparison.
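
For reference, the directional response of a conventional delay and sum beamformer can be computed directly. The following is a minimal sketch, assuming a narrowband (single-frequency) model of the ideal ten-element, half-wavelength line array with angles measured from broadside; the experiment itself used broadband signals, so this shows only the general shape of the comparison pattern.

```python
import cmath
import math


def delay_and_sum_response(angle_deg, n_elements=10, spacing_wavelengths=0.5,
                           look_deg=0.0):
    """Normalized magnitude response of a uniform line array steered to look_deg.

    Angles are measured from broadside; narrowband model (an assumption).
    """
    phase_step = 2.0 * math.pi * spacing_wavelengths * (
        math.sin(math.radians(angle_deg)) - math.sin(math.radians(look_deg)))
    total = sum(cmath.exp(1j * phase_step * k) for k in range(n_elements))
    return abs(total) / n_elements


for angle in (0.0, 5.0, 10.0, 20.0, 45.0):
    print(angle, round(delay_and_sum_response(angle), 4))
```

With ten elements at half-wavelength spacing, the first null of this pattern falls where the sine of the angle equals 0.2, about 11.5 degrees from the look direction.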

Fig. 11(a) represents the adaptive beamformer's performance
with the ideal array and the single column-of-zeros
constraint, while Fig. 11(b) represents performance
with the nonideal array and single column-of-zeros constraint.
Note that the beam formed is "super-directive,"
that is, much narrower than the conventional beam, but
severely reduced in sensitivity when array properties are not
ideal.

Fig. 11(c) and Fig. 11(d) show beamformer performance
with the triple column-of-zeros constraint. In this case the
adaptive beam is close in width to the conventional beam,
and its sensitivity is not affected by element irregularity.
Even at high signal-to-noise ratios sensitivity is sustained
over a finite range of angles, an unusual result since adaptive
beamformers generally lose signals not incident exactly in
the "look" direction.

C.  Adaptive  Phase  Control  of  a  Transmitting 
Antenna  Array 

This experiment illustrates the use of the DSD and LRS
algorithms to solve a problem that cannot be solved with
the LMS algorithm.* The problem selected, adaptive
phase control of a transmitting array, is representative of a
class of problems more general than those heretofore
treated in this paper. Other problems of a similar nature
include adaptive adjustment of the parameters of microwave
resonators, waveguides, and coaxial transmission lines. A
related problem at optical frequencies is adaptive adjustment
by controlled warping of laser mirrors.

It should be noted that the formulas for time constant,
perturbation, and misadjustment of the DSD and LRS
algorithms given in Table I were derived by assuming
stationary stochastic inputs to an adaptive system so
configured that mean square performance is a quadratic
function of the adjustable parameters. The conditions on
which these formulas and proof of convergence are based
are not satisfied in the adaptive phase control problem
examined here. If one ignores the deterministic nature of
the sinusoidal input signals and treats input power as in
the stochastic case, however, the expressions of Table I
provide predictions borne out well by experimental
simulation.

* In the form described in this paper the LMS algorithm can be
used only to adjust variable weights. The DSD and LRS algorithms
do not suffer from this limitation.

Fig. 11. Results of adaptive beamforming experiment. (a) Single column-of-zeros constraint, ideal array. (b) Single
column-of-zeros constraint, nonideal array. (c) Triple column-of-zeros constraint, ideal array. (d) Triple column-of-zeros
constraint, nonideal array.

Fig. 12. Satellite transmitting information to receiver on earth.

Fig. 12 shows a typical application for a transmitting
array with adaptive phase control. A satellite is relaying
information over a large distance to a receiver on the earth.
The power available to drive the transmitter is limited, and
it is desirable for maximum power transfer to keep the
main beam of the transmitting antenna optimized and
steered toward the receiving station, whose position with
respect to the satellite changes with the earth's rotation and
the satellite's orientation. The array's elements need not be
ideal. It is assumed that the power of the received signal can
be measured or estimated and transmitted via a feedback
link to the satellite for use as an input to an adaptive
beamforming process. To avoid a loss of signal power that would
partially or wholly offset the directional gain, the
beamforming process must control the output phase rather than
the gain of the satellite antenna's elements.

Fig. 13 is a block diagram showing the model used to
simulate an adaptive transmitting antenna array of n
elements. The signal is represented by a sine wave produced
by a signal generator. An array of n phase compensators
governed by an adaptive algorithm represents the adaptive
processor. A corresponding array of n phase shifters provides
a means of simulating the unknown phase shifts between
the antenna elements and the receiver. The outputs of the
phase shifters are summed and injected with "receiver"
noise to simulate a weak received signal. This signal is
sampled, squared, and averaged, providing a power estimate
for the adaptive algorithm. The algorithm adjusts the phase
compensators to maximize measured power. It is clear that
maximum power will be transmitted when the combined
phase shifts on each branch of the block diagram are
integral multiples of 360 degrees relative to each other.
Although there is no unique solution to the problem, there
are families of equivalent solutions that provide maximum
power transfer.

Fig. 13. Digital simulation of adaptive transmitting antenna.
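
The existence of families of equivalent solutions can be seen in a short numerical check. The following is a minimal sketch, assuming 16 unit-amplitude sinusoids and no receiver noise; the amplitude, the seed, and the function name are illustrative assumptions. The average power of the summed sinusoids is half the squared magnitude of the phasor sum.

```python
import cmath
import math
import random


def received_power(compensator_phases, path_phases, amplitude=1.0):
    """Average power of the sum of equal-amplitude sinusoids, one per branch."""
    total = sum(amplitude * cmath.exp(1j * (c + p))
                for c, p in zip(compensator_phases, path_phases))
    return 0.5 * abs(total) ** 2


rng = random.Random(0)
n = 16
path = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]   # unknown shifts
aligned = [-p for p in path]                  # cancels every path phase
shifted = [c + 1.234 for c in aligned]        # same offset on every branch
wrapped = [c + 2.0 * math.pi * (i % 3) for i, c in enumerate(aligned)]
print(received_power(aligned, path),
      received_power(shifted, path),
      received_power(wrapped, path),
      received_power([0.0] * n, path))
```

The first three settings give the same maximum power: a common offset on all branches, or whole multiples of 360 degrees on individual branches, leave the relative phases unchanged.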

This model comprises all aspects of the satellite transmission
example described above except the two-way time
delay of the transmission path. This delay would affect the
rate of adaptation of the processor and would have to be
taken into account in designing a real system.

Fig. 14 shows learning curves of the adaptive process
for the DSD and LRS algorithms when the injected noise
of Fig. 13 is set equal to zero. The transmitting antenna was
composed of 16 isotropic elements in a line array. Note that
the curves rise to an asymptote representing maximum
power rather than decaying toward a minimum. Note
further that they are not exponential except as the optimal
solution is approached. Exponential learning curves occur
only when the algorithms are applied to quadratic performance
surfaces. The performance surface for the
simulated problem is a representation of output power as
a function of phase and is not quadratic except near
stationary points, where it can be represented by first- and
second-degree terms of a Taylor expansion.* For this
application the method of steepest descent might better be
designated the "method of steepest ascent." It is described
by (24) with the sign of μ reversed. A corresponding reversal
of sign is also required in applying the LRS algorithm to
this problem.
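
The steepest ascent procedure can be sketched for the noise-free case as follows. This is a minimal illustration, not the simulation that produced Fig. 14: the step size mu = 0.01, the perturbation delta = 0.05 radian, the unit element amplitudes, the number of iterations, and the seed are all assumed values. Each derivative is measured, as in a differential steepest descent scheme, by perturbing one phase forward and backward and differencing the two power readings.

```python
import cmath
import math
import random


def power(phases, path_phases):
    """Noise-free average power of the summed unit-amplitude sinusoids."""
    total = sum(cmath.exp(1j * (c + p)) for c, p in zip(phases, path_phases))
    return 0.5 * abs(total) ** 2


def steepest_ascent_phase_control(path_phases, mu=0.01, delta=0.05,
                                  n_iterations=500):
    """Adjust compensator phases by steepest ascent on measured power."""
    phases = [0.0] * len(path_phases)        # initial compensator setting
    learning_curve = [power(phases, path_phases)]
    for _ in range(n_iterations):
        gradient = []
        for i in range(len(phases)):
            forward = list(phases)
            backward = list(phases)
            forward[i] += delta
            backward[i] -= delta
            gradient.append((power(forward, path_phases)
                             - power(backward, path_phases)) / (2.0 * delta))
        # Ascent: move along the measured gradient (sign reversed from descent).
        phases = [c + mu * g for c, g in zip(phases, gradient)]
        learning_curve.append(power(phases, path_phases))
    return phases, learning_curve


rng = random.Random(0)
path_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(16)]
final_phases, curve = steepest_ascent_phase_control(path_phases)
print(round(curve[0], 2), round(curve[-1], 2))
```

With unit amplitudes the largest attainable power in this sketch is 0.5 * 16**2 = 128, and the learning curve rises toward that asymptote rather than decaying toward a minimum.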

* [...] shown by M. K. Leavitt, in a June 1975 term paper for
EE 373 [...] at Stanford University, that the performance [...].

Fig. 14. Learning curves of simulated adaptive transmitting antenna
without noise. (a) DSD algorithm. (b) LRS algorithm.

The "theoretical" time constant of both learning curves
of Fig. 14 is 128 data samples. This value is based on the
characteristics of the performance surface (that is, its
"R-matrix") in the vicinity of the global optimum.
Visual inspection indicates that the actual time constants
of the two curves are similar and agree well with the above
value.  The  convergence  parameter  x  for  the  DSD  algorithm 
was  8  X  10"’.  The  convergence  parameters  and  o* 
for  the  LRS  algorithm  were  1  and  8  x  10~*,  respectively. 
The  maximum  transmitted  power  was  equal  to  32.  The 
“perturbation”  P  for  both  algorithms  was  5  percent,  and 
the  value  of  N  was  one. 

Fig. 15 shows sequences of radiation patterns corresponding
to the learning curves of Fig. 14. Real time is indicated
in terms of data samples equivalent to sampling periods of
the digital system of Fig. 13. The simulated receiving site
was located at a relative angle of 20 degrees. The initial
setting of the phase compensators was zero. The unknown
phase settings of the phase shifters were chosen at random.
Note the rapid formation of the main lobe at 20 degrees and
the suppression of sidelobes.

Fig. 16 shows learning curves of the adaptive process
when independent samples of white noise with a power of
0.01 were injected into the simulated received signal. Array
configuration and adaptive parameters are the same as in
the noiseless case represented by Fig. 14. As well as can be
determined by visual inspection, the actual time constants
for both algorithms are also approximately the same as in
the noiseless case.

Noise in the adaptive phase control process, as evident in
Fig. 16, causes a steady-state average loss of array power
gain. One can define for this case a form of misadjustment
that is a ratio of the loss in power to the peak power.
Though the appropriate formulas have not yet been derived,
the formulas for stochastic inputs and quadratic performance
surfaces would suggest that with equal theoretical time
constants the misadjustment of the LRS algorithm would be
greater than that of the DSD algorithm. This expectation is
confirmed by the results obtained in this experiment.

Fig. 16. Learning curves of simulated adaptive transmitting antenna
with noise. (a) DSD algorithm. (b) LRS algorithm.
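
The form of misadjustment defined above, the ratio of the steady-state loss in power to the peak power, can be written as a one-line computation. The following is a minimal sketch; the function name is hypothetical, and the sample power readings are made-up numbers used only to show the arithmetic, not measurements from Fig. 16.

```python
def power_misadjustment(steady_state_powers, peak_power):
    """Ratio of the average steady-state loss in power to the peak power."""
    average_power = sum(steady_state_powers) / len(steady_state_powers)
    return (peak_power - average_power) / peak_power


# Made-up steady-state readings against an assumed peak power of 32.
readings = [30.0, 31.0, 29.0, 30.0]
print(power_misadjustment(readings, peak_power=32.0))
```

A process that held the peak power exactly would give a misadjustment of zero under this definition.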


VII.  Conclusion 

The  theoretical  and  experimental  results  presented  in  this 
paper  show  that  the  LMS  algorithm  is  the  most  efficient  by 
a  large  factor  of  the  three  algorithms  compared  and 
indicate  that  it  should  be  used  whenever  circumstances 
permit.  The  DSD  algorithm  is  less  efficient  than  the  LMS 
but  more  efficient  by  a  factor  of  two  than  the  LRS  algorithm. 
Its use is appropriate where technical or economic considerations
preclude use of the LMS algorithm or where
a high speed of adaptation is not required. Use of the LRS
algorithm may be appropriate in cases where the performance
surface for the adaptive process is not well
behaved  and  has  both  local  and  global  optima.  Further 
experience  is  required,  however,  to  confirm  that  the  random 
weight  vector  changes  associated  with  this  algorithm  can 
provide  an  advantage  in  the  presence  of  local  optima  that 
may  slow  or  prevent  global  convergence  of  algorithms 
based  on  the  method  of  steepest  descent.  Further  work  is 
also  required  to  extend  the  theoretical  derivations  for  time 
constant  and  misadjustment  of  the  three  algorithms  to 
applications  other  than  those  entailing  stochastic  inputs 
and quadratic performance surfaces.